Lake St. Croix Water Quality Redundancy Analysis

Background

Project Info

This document compares data collected in Lake St. Croix by the Great Lakes Network (GLKN) as part of their Water Quality (Rivers) monitoring protocol with data collected by the Metropolitan Council’s Environmental Services (METC) Department.

For GLKN, we used the most recently published 2006 – 2024 Large Rivers Water Quality Monitoring Data Package (IRMA record 2309369). For METC, we downloaded all data from 2006 and later from METC’s Environmental Information Management Systems (EIMS) web server for all monitoring locations within Lake St. Croix.

Based on the USGS National Water Dashboard, Lake St. Croix is bounded by two USGS stream gages, which can be used to model discharge. USGS 05341550 is at the start of Lake St. Croix, very close to METC_STCR_23.6; USGS 05344490 is at the confluence near METC_STCR_0.1 and SACN_STCR_2.0.

All code used for this analysis is posted on GitHub in the following repository: https://github.com/KateMMiller/GLKN_MCES_analysis

Sites included in this analysis are plotted in the map below.

Code Prep

Install waterGLKN R package

The waterGLKN package is still a work in progress, but it makes it easier to import and query GLKN water data using the format of the latest rivers and inland lakes data packages. You must also have RTools44 installed, which can be installed via the Software Center, along with the devtools R package.

devtools::install_github("katemmiller/waterGLKN")
library(waterGLKN)

Troubleshooting GitHub-installed packages:

If you’re unable to install the R package via GitHub (often an error about permission being denied), download the following script and open it in R: fix_TLS_inspection.R

Once the script is open in RStudio, press Control + A to select all of the code, then Control + Enter to run it. Assuming no errors are returned, you should now be able to install from GitHub. Repeat the install_github code above, which should now successfully install the waterGLKN package.
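
Before retrying the install, it can help to confirm which packages R can already find. The following is a minimal sketch using base R only; the pkgs vector and the message wording are illustrative and not part of the project's code.

```r
# Packages needed for the GitHub install step described above.
pkgs <- c("devtools", "waterGLKN")

# requireNamespace() returns TRUE if a package can be loaded and FALSE otherwise,
# without attaching it to the search path.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Report anything that is still missing.
missing_pkgs <- names(installed)[!installed]
if (length(missing_pkgs) > 0) {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are available.")
}
```

If waterGLKN is still listed as missing after running the troubleshooting script, the install step has not succeeded yet.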

Load other dependencies and imports

The params csv was developed by GLKN staff manually matching parameter names between GLKN and METC datasets. The csv is posted on the GitHub repo. Note that the waterGLKN package already adds a column of parameter abbreviations following the same naming convention as the params.csv below.

library(tidyverse)
library(knitr)
library(kableExtra)
library(DT)
params <- read.csv("https://raw.githubusercontent.com/KateMMiller/GLKN_MCES_analysis/refs/heads/main/data/GLKN_vs_METC_params_list_final.csv")
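
To illustrate how a lookup table like the params csv is used, here is a small sketch with made-up rows. The column names Parameter, Units, and New_Name match those used in the join later in this document; the specific rows and the object names toy_params and toy_metc are invented for illustration, and base R merge() stands in for the dplyr join.

```r
# Hypothetical lookup rows; the real table is the params csv read above.
toy_params <- data.frame(
  Parameter = c("Dissolved Oxygen", "Secchi Depth"),
  Units     = c("mg/L", "m"),
  New_Name  = c("DO_mgL", "Secchi_m"),
  stringsAsFactors = FALSE
)

# Hypothetical METC-style records, including one parameter with no match.
toy_metc <- data.frame(
  PARAMETER = c("Dissolved Oxygen", "Secchi Depth", "Unmatched Parameter"),
  units     = c("mg/L", "m", "mg/L"),
  value     = c(10.9, 1.75, 3.2),
  stringsAsFactors = FALSE
)

# Left join on parameter name and units; unmatched rows get NA for New_Name.
toy_join <- merge(toy_metc, toy_params,
                  by.x = c("PARAMETER", "units"),
                  by.y = c("Parameter", "Units"),
                  all.x = TRUE)

# Keep only records that received a standardized abbreviation.
toy_join <- toy_join[!is.na(toy_join$New_Name), ]
```

Joining on both the parameter name and the units means a record only matches when both agree, which guards against silently mixing results reported in different units.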

Compile Data

Download Data Package and import into R

Download the latest GLKN Rivers Data Package. The easiest approach is to download everything as a single zip. The file path below points to where the zip is stored on Kate’s machine; change it to match your local copy. Then import the data package into R using the waterGLKN function importData().

library(waterGLKN)
river_zip = "../data/GLKN_water/records-2309369.zip"
importData(type = 'zip', filepath = river_zip)

Filter Results.csv by Lake St. Croix sites and prepare data for binding with METC data.

# Site list
lksc <- c("SACN_STCR_20.0", "SACN_STCR_15.8", "SACN_STCR_2.0")
# pull in all results for Lk St. Croix sites
sacn <- getResults(park = "SACN", site = lksc, sample_type = "VS",
                   parameter = 'all',
                   months = 1:12,
                   sample_depth = 'all',
                   include_censored = T,
                   output = 'verbose') |>
  select(Org_Code, Park_Code, Location_ID, Location_Name, Activity_Type, Activity_Comment, sample_date, doy, year, month,
         depth_cat = Activity_Relative_Depth, depth = Activity_Depth, depth_unit = Activity_Depth_Unit,
         PARAMETER = Characteristic_Name, param_name, value, unit = Result_Unit, Filtered_Fraction, Result_Comment, 
         censored, Result_Detection_Condition, Method_Detection_Limit, Lower_Quantification_Limit, Upper_Quantification_Limit)

# Use Result_Comment to extract censored value
sacn$value_cen <- suppressWarnings(
  ifelse(sacn$Result_Detection_Condition %in% c("Present Above Quantification Limit", "Present Below Quantification Limit"),
           as.numeric(str_extract(sacn$Result_Comment, "\\d*\\.*\\d+")), sacn$value))

# Update the unit before renaming; once param_name is changed, "ChlA_ppb" no longer matches
sacn$unit[sacn$param_name == "ChlA_ppb"] <- "ug/l"
sacn$param_name[sacn$param_name == "ChlA_ppb"] <- "ChlA_ugL"

# convert depth in ft to m (depths are actually all NA here, but other sites may be in ft in later analyses)
sacn$depth[!is.na(sacn$depth) & sacn$depth_unit == "ft"] <- 
  sacn$depth[!is.na(sacn$depth) & sacn$depth_unit == "ft"] * 0.3048
sacn$depth_unit[sacn$depth_unit == "ft"] <- "m"

# Check for duplicates in sacn
sacn_dups <- sacn |> group_by(Location_ID, sample_date, depth, Activity_Type, param_name) |> 
  mutate(num_samps = sum(!is.na(value))) |> filter(num_samps > 1) |> 
  arrange(Location_ID, sample_date, depth, param_name) |> 
    select(Location_ID, sample_date, depth_cat, depth, Activity_Type, PARAMETER, param_name, value, 
           unit, num_samps, Filtered_Fraction, Activity_Comment, Result_Comment, Result_Detection_Condition)
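
The censored-value step earlier in this chunk pulls a number out of Result_Comment for records flagged above or below a quantification limit. The sketch below shows the same idea on invented rows using base R only. The comment strings are hypothetical (the real comment format is not shown in this document), and extract_first_number() is an illustrative helper, not a waterGLKN function.

```r
# Hypothetical inputs; real comments and detection conditions come from the data package.
toy <- data.frame(
  value = c(5.0, 2.0, 8.1),
  Result_Detection_Condition = c("Present Below Quantification Limit",
                                 "Detected and Quantified",
                                 "Present Below Quantification Limit"),
  Result_Comment = c("Estimated value 1.25", NA, "no number recorded"),
  stringsAsFactors = FALSE
)

# Pull the first number (integer or decimal) out of each string; NA if none is found.
extract_first_number <- function(x) {
  out <- rep(NA_real_, length(x))
  m <- regexpr("[0-9]*\\.?[0-9]+", x)
  hit <- !is.na(m) & m > -1L
  out[hit] <- as.numeric(regmatches(x, m))
  out
}

limit_flags <- c("Present Above Quantification Limit",
                 "Present Below Quantification Limit")

# Use the number from the comment for flagged rows, otherwise the reported value.
toy$value_cen <- ifelse(toy$Result_Detection_Condition %in% limit_flags,
                        extract_first_number(toy$Result_Comment),
                        toy$value)
```

Note that a flagged record whose comment contains no number ends up with NA in value_cen, so counting those cases after the extraction is a worthwhile check.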

Duplicates in the SACN data

In general, the duplicate values are very close. In some pairs one record is total and the other is dissolved; in others both records are in the same fraction category. Only the first record per group is kept, after sorting by Activity_Comment.

sdk <- kable(sacn_dups, format = 'html', align = c(rep('c', 4), 'l', 'l', 'l', 'c', 'c', 'l', 'l', 'l'), 
             caption = "Duplicate measurements within the same site, date, depth, and parameter. ") |> 
       kable_styling(fixed_thead = TRUE, bootstrap_options = c('condensed'), 
                     full_width = TRUE, position = 'left', font_size = 12) |> 
       collapse_rows(1:7, valign = 'top')
Duplicate measurements within the same site, date, depth, and parameter.
Location_ID sample_date depth_cat depth Activity_Type PARAMETER param_name value unit num_samps Filtered_Fraction Activity_Comment Result_Comment Result_Detection_Condition
SACN_STCR_15.8 2014-04-17 Surface Sample-Integrated Vertical Profile Alkalinity, Total (total hydroxide+carbonate+bicarbonate) Alkalinity_mgL 31.00000 mg/l 2 Total Organization ActivityID=SACN_STCR_15.8_201404170014 Reported Measure Qualifier: MTRX Detected and Quantified
25.00000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_15.8_201404170015 Detected and Quantified
Calcium Ca_mgL 9.10000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_15.8_201404170014 Detected and Quantified
9.20000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_15.8_201404170015 Detected and Quantified
Chloride Cl_mgL 4.40000 mg/l 2 Total Organization ActivityID=SACN_STCR_15.8_201404170015 Detected and Quantified
4.30000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_15.8_201404170014 Detected and Quantified
Potassium K_mgL 2.16000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_15.8_201404170015 Detected and Quantified
2.00000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_15.8_201404170014 Detected and Quantified
Magnesium Mg_mgL 3.40000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_15.8_201404170014 Detected and Quantified
3.20000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_15.8_201404170015 Detected and Quantified
Sodium Na_mgL 2.59000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_15.8_201404170015 Detected and Quantified
2.80000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_15.8_201404170014 Detected and Quantified
Sulfur, sulfate (SO4) as SO4 SO4_mgL 3.60000 mg/l 2 Total Organization ActivityID=SACN_STCR_15.8_201404170015 Detected and Quantified
4.10000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_15.8_201404170014 Detected and Quantified
Silicate Si_mgL 7.59000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_15.8_201404170015 Detected and Quantified
8.10000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_15.8_201404170014 Detected and Quantified
Solids, Suspended (TSS) TSS_mgL 8.00000 mg/l 2 Non-Filterable (Particle) Organization ActivityID=SACN_STCR_15.8_201404170015 Detected and Quantified
7.00000 mg/l 2 Non-Filterable (Particle) Organization ActivityID=SACN_STCR_15.8_201404170014 Detected and Quantified
2014-07-02 Alkalinity, Total (total hydroxide+carbonate+bicarbonate) Alkalinity_mgL 59.00000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_15.8_201407020017 Detected and Quantified
65.00000 mg/l 2 Total Organization ActivityID=SACN_STCR_15.8_201407020016 Detected and Quantified
SACN_STCR_2.0 2009-07-07 Sample-Routine Nitrogen N_ugL 945.59000 ug/l 2 Total Near-surface water collected with 2-meter integrated sampler; Near-bottom sample collected with Van Dorn sampler at 13.0 meters Organization ActivityID=SACN_STCR_2.0_20090707_000018 Detected and Quantified
973.36000 ug/l 2 Total Near-surface water collected with 2-meter integrated sampler; Near-bottom sample collected with Van Dorn sampler at 13.0 meters Organization ActivityID=SACN_STCR_2.0_20090707_000018 Van Dorn Detected and Quantified
Phosphorus as P P_ugL 75.31000 ug/l 2 Total Near-surface water collected with 2-meter integrated sampler; Near-bottom sample collected with Van Dorn sampler at 13.0 meters Organization ActivityID=SACN_STCR_2.0_20090707_000018 Detected and Quantified
74.58000 ug/l 2 Total Near-surface water collected with 2-meter integrated sampler; Near-bottom sample collected with Van Dorn sampler at 13.0 meters Organization ActivityID=SACN_STCR_2.0_20090707_000018 Van Dorn Detected and Quantified
2014-04-17 Surface Sample-Integrated Vertical Profile Alkalinity, Total (total hydroxide+carbonate+bicarbonate) Alkalinity_mgL 63.00000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_2.0_201404170019 Detected and Quantified
72.00000 mg/l 2 Total Organization ActivityID=SACN_STCR_2.0_201404170018 Detected and Quantified
Calcium Ca_mgL 19.00000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_2.0_201404170018 Detected and Quantified
18.80000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_2.0_201404170019 Detected and Quantified
Chloride Cl_mgL 8.00000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_2.0_201404170018 Detected and Quantified
8.40000 mg/l 2 Total Organization ActivityID=SACN_STCR_2.0_201404170019 Detected and Quantified
Carbon, organic DOC_mgL 8.00000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_2.0_201404170018 Detected and Quantified
6.50000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_2.0_201404170019 Detected and Quantified
Potassium K_mgL 2.20000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_2.0_201404170018 Detected and Quantified
2.27000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_2.0_201404170019 Detected and Quantified
Magnesium Mg_mgL 7.00000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_2.0_201404170018 Detected and Quantified
6.40000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_2.0_201404170019 Detected and Quantified
Sodium Na_mgL 4.60000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_2.0_201404170018 Detected and Quantified
4.56000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_2.0_201404170019 Detected and Quantified
Sulfur, sulfate (SO4) as SO4 SO4_mgL 6.10000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_2.0_201404170018 Detected and Quantified
5.10000 mg/l 2 Total Organization ActivityID=SACN_STCR_2.0_201404170019 Detected and Quantified
Silicate Si_mgL 11.00000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_2.0_201404170018 Detected and Quantified
10.90000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_2.0_201404170019 Detected and Quantified
Solids, Suspended (TSS) TSS_mgL 6.00000 mg/l 2 Non-Filterable (Particle) Organization ActivityID=SACN_STCR_2.0_201404170018 Detected and Quantified
5.90000 mg/l 2 Non-Filterable (Particle) Organization ActivityID=SACN_STCR_2.0_201404170019 Detected and Quantified
2014-07-02 Alkalinity, Total (total hydroxide+carbonate+bicarbonate) Alkalinity_mgL 61.00000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_2.0_201407020021 Detected and Quantified
65.00000 mg/l 2 Total Organization ActivityID=SACN_STCR_2.0_201407020020 Detected and Quantified
Midwater Phosphorus as P P_ugL 76.73107 ug/l 2 Total TP-Van Dorn Organization ActivityID=SACN_STCR_2.0_201407020023 Van Dorn Detected and Quantified
Surface 78.78227 ug/l 2 Total Organization ActivityID=SACN_STCR_2.0_201407020022 Analysis Date: 140926/141010 Detected and Quantified
2020-07-07 Midwater 24.80700 ug/l 2 Total Organization ActivityID=SACN_STCR_2.0_20200707000002 Detected and Quantified
36.79500 ug/l 2 Total Organization ActivityID=SACN_STCR_2.0_20200707000002 Detected and Quantified
SACN_STCR_20.0 2009-07-07 Sample-Routine Nitrogen N_ugL 1040.93000 ug/l 2 Total Near-surface water collected with 2-meter integrated sampler; sampler used to collect water at 9.1 meters Organization ActivityID=SACN_STCR_20.0_20090707_000014 Van Dorn Detected and Quantified
678.84000 ug/l 2 Total Near-surface water collected with 2-meter integrated sampler; sampler used to collect water at 9.1 meters Organization ActivityID=SACN_STCR_20.0_20090707_000014 Detected and Quantified
Phosphorus as P P_ugL 99.77000 ug/l 2 Total Near-surface water collected with 2-meter integrated sampler; sampler used to collect water at 9.1 meters Organization ActivityID=SACN_STCR_20.0_20090707_000014 Van Dorn Detected and Quantified
35.00000 ug/l 2 Total Near-surface water collected with 2-meter integrated sampler; sampler used to collect water at 9.1 meters Organization ActivityID=SACN_STCR_20.0_20090707_000014 Detected and Quantified
2014-04-17 Surface Sample-Integrated Vertical Profile Alkalinity, Total (total hydroxide+carbonate+bicarbonate) Alkalinity_mgL 27.00000 mg/l 2 Total Organization ActivityID=SACN_STCR_20.0_201404170016 Detected and Quantified
26.00000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_20.0_201404170017 Detected and Quantified
Calcium Ca_mgL 8.70000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_20.0_201404170016 Detected and Quantified
8.90000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_20.0_201404170017 Detected and Quantified
Chloride Cl_mgL 4.20000 mg/l 2 Total Organization ActivityID=SACN_STCR_20.0_201404170017 Detected and Quantified
5.10000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_20.0_201404170016 Detected and Quantified
Potassium K_mgL 2.10000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_20.0_201404170017 Detected and Quantified
1.90000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_20.0_201404170016 Detected and Quantified
Magnesium Mg_mgL 3.20000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_20.0_201404170016 Detected and Quantified
3.10000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_20.0_201404170017 Detected and Quantified
Sodium Na_mgL 2.49000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_20.0_201404170017 Detected and Quantified
2.60000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_20.0_201404170016 Detected and Quantified
Sulfur, sulfate (SO4) as SO4 SO4_mgL 3.60000 mg/l 2 Total Organization ActivityID=SACN_STCR_20.0_201404170017 Detected and Quantified
4.10000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_20.0_201404170016 Detected and Quantified
Silicate Si_mgL 7.64000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_20.0_201404170017 Detected and Quantified
8.50000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_20.0_201404170016 Detected and Quantified
Solids, Suspended (TSS) TSS_mgL 8.00000 mg/l 2 Non-Filterable (Particle) Organization ActivityID=SACN_STCR_20.0_201404170016 Detected and Quantified
7.90000 mg/l 2 Non-Filterable (Particle) Organization ActivityID=SACN_STCR_20.0_201404170017 Detected and Quantified
2014-07-02 Alkalinity, Total (total hydroxide+carbonate+bicarbonate) Alkalinity_mgL 67.00000 mg/l 2 Dissolved Organization ActivityID=SACN_STCR_20.0_201407020017 Detected and Quantified
69.00000 mg/l 2 Total Organization ActivityID=SACN_STCR_20.0_201407020016 Detected and Quantified
2020-07-08 Midwater Phosphorus as P P_ugL 56.33600 ug/l 2 Total Organization ActivityID=SACN_STCR_20.0_20200708000002 Detected and Quantified
53.72000 ug/l 2 Total Organization ActivityID=SACN_STCR_20.0_20200708000002 Detected and Quantified

Final dataset for SACN to bind with METC

sacn_final <- sacn |> 
  arrange(Activity_Comment) |>
  group_by(Org_Code, Park_Code, Location_ID, Location_Name, Activity_Type,
           sample_date, doy, year, month, depth_cat, depth, depth_unit,
           PARAMETER, param_name, unit, qualifier = Result_Detection_Condition, 
           censored) |> 
  # value_cen is summarized rather than grouped on, so duplicates with different values collapse to one row
  summarize(value = first(value), value_cen = first(value_cen), .groups = 'drop')
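
The keep-the-first-record-per-group approach can be sketched in base R as a sort followed by duplicated(). The rows, site name, and comment strings below are invented for illustration; only the column names follow this document.

```r
# Hypothetical duplicate measurements for one site and parameter.
toy <- data.frame(
  Location_ID      = c("SITE_A", "SITE_A", "SITE_A"),
  sample_date      = c("2014-04-17", "2014-04-17", "2014-07-02"),
  param_name       = c("Ca_mgL", "Ca_mgL", "Ca_mgL"),
  Activity_Comment = c("ID_0015", "ID_0014", "ID_0016"),
  value            = c(9.2, 9.1, 10.4),
  stringsAsFactors = FALSE
)

# Sort so the preferred record comes first within each group.
toy_sorted <- toy[order(toy$Activity_Comment), ]

# duplicated() flags every repeat of a key after its first occurrence,
# so negating it keeps exactly one row per site, date, and parameter.
keys <- toy_sorted[, c("Location_ID", "sample_date", "param_name")]
toy_first <- toy_sorted[!duplicated(keys), ]
```

The result only collapses duplicates when the measured value itself is left out of the grouping keys; if the value is part of the key, two records with different values count as distinct groups and both are retained.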

SACN Non-detect Conditions

sacn_qual <- sacn_final |> group_by(Location_ID, year, month, param_name, qualifier) |> 
  summarize(num_samps = sum(!is.na(value)), .groups = 'drop') |> 
  filter(qualifier != "Detected and Quantified")

sacn_dt <- datatable(
                sacn_qual, 
                class = 'cell-border stripe', rownames = F, width = '1200px',
                extensions = c("Buttons"),
                options = list(       
                            initComplete = htmlwidgets::JS(
                            "function(settings, json) {",
                              "$('body').css({'font-size': '11px'});",
                              "$('body').css({'font-family': 'Arial'});",
                              "$(this.api().table().header()).css({'font-size': '11px'});",
                              "$(this.api().table().header()).css({'font-family': 'Arial'});",
                            "}"),
                pageLength = 50, autoWidth = TRUE, scrollX = TRUE, scrollY = '600px',
                scrollCollapse = TRUE, lengthMenu = c(5, 10, 50, nrow(sacn_qual)),
                fixedColumns = list(leftColumns = 1),
                dom = "Blfrtip", buttons = c('copy', 'csv', 'print')),
                filter = list(position = 'top', clear = FALSE)#,
                )

Prepare METC data for binding with GLKN data

Download all data from 2006 and later from the METC’s Environmental Information Management Systems (EIMS) web server for all monitoring locations within Lake St. Croix. The data for this analysis were downloaded on 3/18/2025.

# metc data from webserver
metc <- read.csv("./data/local/2025-03-18_1359_56_MCES_EIMS_data.csv")

# name sites based on Rick's stream distance method
metc1 <- metc |> 
  mutate(Location_ID = case_when(grepl("SC0003", STATION_ID) ~ "METC_STCR_0.1",
                                 grepl("SC0233", STATION_ID) ~ "METC_STCR_23.6",
                                 grepl("SC0234", STATION_ID) ~ "METC_STCR_23.4",
                                 grepl("82000100-08", STATION_ID) ~ "METC_STCR_22.5",
                                 grepl("82000100-06", STATION_ID) ~ "METC_STCR_11.3",
                                 grepl("82000100-03", STATION_ID) ~ "METC_STCR_16.6",
                                 grepl("82000100-05", STATION_ID) ~ "METC_STCR_12.4",
                                 grepl("82000100-04", STATION_ID) ~ "METC_STCR_15.3",
                                 grepl("82000100-01", STATION_ID) ~ "METC_STCR_22.3")) |>
  filter(!is.na(Location_ID)) # UM8128 is the only station that doesn't get matched with Rick's new LocIDs

metc1$Org_Code <- "METC"
metc1$Park_Code <- "METC"
metc1$sample_date <- format(as.Date(metc1$START_DATE_TIME, format = "%m/%d/%Y %H:%M"), format = "%Y-%m-%d")
metc1$year <- as.numeric(substr(metc1$sample_date, 1, 4))
metc1$month <- as.numeric(substr(metc1$sample_date, 6, 7))
metc1$doy <- as.numeric(format(as.Date(metc1$sample_date, format = "%Y-%m-%d"), "%j"))
metc1$depth_unit <- "m"
metc1$depth_cat <- ifelse(metc1$SAMPLE_DEPTH_m <= 1, "surface", NA_character_)
metc1$censored <- FALSE
metc1$value_cen <- metc1$RESULT
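
The date handling above can be checked on a couple of values before applying it to the full table. This is a minimal base R sketch; the timestamps are invented and assume the same month/day/year hour:minute layout used for START_DATE_TIME.

```r
# Hypothetical timestamps in the month/day/year hour:minute layout.
start_date_time <- c("03/18/2008 09:30", "12/01/2009 14:05")

# Parse to Date (the time of day is dropped), then format as ISO year-month-day.
parsed <- as.Date(start_date_time, format = "%m/%d/%Y %H:%M")
sample_date <- format(parsed, "%Y-%m-%d")

# Derive the numeric date parts used for grouping and plotting.
year  <- as.numeric(format(parsed, "%Y"))
month <- as.numeric(format(parsed, "%m"))
doy   <- as.numeric(format(parsed, "%j"))  # day of year, 1 to 366

# Rows that fail to parse become NA, so a quick count is a useful sanity check.
n_unparsed <- sum(is.na(parsed))
```

Extracting year and month with format() rather than substr() avoids depending on character positions, though both give the same result for ISO-formatted dates.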

metc2 <- metc1 |>
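  # RESULT for these parameters appears to be reported in mg/L (see the units column);
  # multiply by 1000 to express value in ug/L, matching parameter names such as ChlA_ugL
  mutate(value = ifelse(PARAMETER %in% c( "Chlorophyll-a, Pheo-Corrected",
                                           "Total Nitrate/Nitrite N, Unfiltered",
                                           "Total Phosphorus, Unfiltered"),
                                           RESULT * 1000,
                                           RESULT)) |>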
  select(Org_Code, Park_Code, Location_ID, Location_Name = NAME, sample_date, doy, year, month,
         depth_cat, depth = SAMPLE_DEPTH_m, depth_unit, PARAMETER, value, units = UNITS, 
         qualifier = QUALIFIER, censored, value_cen)
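
Conversions between mg/L and ug/L are easy to apply in the wrong direction, so a quick check against a known value can help. The sketch below uses hypothetical helper names (mgL_to_ugL, ugL_to_mgL) that are not part of waterGLKN, and an invented chlorophyll-a value.

```r
# 1 mg/L = 1000 ug/L, so converting mg/L to ug/L multiplies by 1000.
mgL_to_ugL <- function(x) x * 1000
ugL_to_mgL <- function(x) x / 1000

# Hypothetical chlorophyll-a result reported in mg/L.
chla_mgL <- 0.0049
chla_ugL <- mgL_to_ugL(chla_mgL)

# Round-trip check: converting back should recover the original value.
stopifnot(isTRUE(all.equal(ugL_to_mgL(chla_ugL), chla_mgL)))
```

A value of a few thousandths of a mg/L corresponds to a few ug/L, so a converted column whose values shrink by a factor of 1000 instead of growing is a sign the operation was inverted.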

# Join METC data with param.csv for parameter abbreviations (param_name)
metc_join <- left_join(metc2, params, by = c("PARAMETER" = "Parameter", "units" = "Units")) |>
  filter(!is.na(New_Name)) |>
  mutate(Activity_Type = "Sample") |> 
  select(Org_Code, Park_Code, Location_ID, Location_Name, Activity_Type, sample_date,
         doy, year, month, depth_cat, depth,
         depth_unit, PARAMETER, param_name = New_Name, value, unit = units, qualifier, censored, value_cen)

# Check for duplicate samples in the same site, date, depth, parameter
metc_dup <- metc_join |> group_by(Location_ID, sample_date, depth_cat, depth, param_name, qualifier) |> 
  mutate(num_samps = sum(!is.na(value))) |> filter(num_samps > 1) |> 
  arrange(Location_ID, sample_date, depth, param_name)

METC Data Qualifiers

metc_qual <- metc_join |> group_by(Location_ID, year, month, param_name, qualifier) |> 
  summarize(num_samps = sum(!is.na(value)), .groups = 'drop') |> filter(qualifier != "Valid")

qual_dt <- datatable(
                metc_qual, 
                class = 'cell-border stripe', rownames = F, width = '1200px',
                extensions = c("Buttons"),
                options = list(       
                            initComplete = htmlwidgets::JS(
                            "function(settings, json) {",
                              "$('body').css({'font-size': '11px'});",
                              "$('body').css({'font-family': 'Arial'});",
                              "$(this.api().table().header()).css({'font-size': '11px'});",
                              "$(this.api().table().header()).css({'font-family': 'Arial'});",
                            "}"),
                pageLength = 50, autoWidth = TRUE, scrollX = TRUE, scrollY = '600px',
                scrollCollapse = TRUE, lengthMenu = c(5, 10, 50, nrow(metc_qual)),
                fixedColumns = list(leftColumns = 1),
                dom = "Blfrtip", buttons = c('copy', 'csv', 'print')),
                filter = list(position = 'top', clear = FALSE)#,
                )

METC data qualifiers that are not “Valid” by site, parameter, and year

Many records in the early years are flagged as Preliminary, much like the distinction between accepted and certified records in our data. Records flagged as Suspect should be dropped.

qual_dt
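
One thing to watch when dropping Suspect records with an inequality test is how missing qualifiers are treated. Whether the METC export actually contains missing qualifiers is not known here; the sketch below uses invented rows and base R subset(), which handles NA conditions the same way dplyr::filter() does.

```r
# Hypothetical records; one has a missing qualifier.
toy <- data.frame(
  value     = c(10.9, 11.2, 9.5, 8.7),
  qualifier = c("Valid", "Preliminary", "Suspect", NA),
  stringsAsFactors = FALSE
)

# A plain inequality evaluates to NA for the missing qualifier, and
# subset() (like dplyr::filter()) drops rows where the condition is NA.
drop_suspect <- subset(toy, qualifier != "Suspect")

# Adding an explicit is.na() test keeps rows with a missing qualifier.
keep_missing <- subset(toy, is.na(qualifier) | qualifier != "Suspect")
```

Which behavior is appropriate depends on whether an unqualified record should be trusted; comparing row counts from the two versions shows how many records are affected.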

Duplicates in the METC data

In general, the duplicate values are close. Most METC duplicates are dissolved oxygen, Secchi depth, or water temperature measurements, along with a few chlorophyll-a results. Only the first record per group is kept.

mdk <- kable(metc_dup, format = 'html', align = c(rep('c', 4), 'l', 'l', 'l', 'c', 'c', 'l', 'l', 'l', rep('c', 6)), 
             caption = "Duplicate measurements within the same site, date, depth, and parameter. ") |> 
       kable_styling(fixed_thead = TRUE, bootstrap_options = c('condensed'), 
                     full_width = TRUE, position = 'left', font_size = 12) |> 
       collapse_rows(1:7, valign = 'top')
Duplicate measurements within the same site, date, depth, and parameter.
Org_Code Park_Code Location_ID Location_Name Activity_Type sample_date doy year month depth_cat depth depth_unit PARAMETER param_name value unit qualifier censored value_cen num_samps
METC METC METC_STCR_0.1 St. Croix River Sample 2007-04-03 93 2007 4 surface 1 m Dissolved Oxygen DO_mgL 10.8600000 mg/L Preliminary FALSE 10.8600 2
2007 4 surface 1 m Dissolved Oxygen DO_mgL 9.4600000 mg/L Preliminary FALSE 9.4600 2
2007-11-27 331 2007 11 surface 1 m Dissolved Oxygen DO_mgL 10.6000000 mg/L Preliminary FALSE 10.6000 2
2007 11 surface 1 m Dissolved Oxygen DO_mgL 11.7400000 mg/L Preliminary FALSE 11.7400 2
2008-03-18 78 2008 3 surface 1 m Dissolved Oxygen DO_mgL 11.3400000 mg/L Valid FALSE 11.3400 2
2008 3 surface 1 m Dissolved Oxygen DO_mgL 10.6000000 mg/L Valid FALSE 10.6000 2
2008-04-01 92 2008 4 surface 1 m Dissolved Oxygen DO_mgL 12.4500000 mg/L Valid FALSE 12.4500 2
2008 4 surface 1 m Dissolved Oxygen DO_mgL 11.0500000 mg/L Valid FALSE 11.0500 2
2008-04-08 99 2008 4 surface 1 m Dissolved Oxygen DO_mgL 12.1200000 mg/L Valid FALSE 12.1200 2
2008 4 surface 1 m Dissolved Oxygen DO_mgL 11.0900000 mg/L Valid FALSE 11.0900 2
2008-04-15 106 2008 4 surface 1 m Dissolved Oxygen DO_mgL 12.5100000 mg/L Valid FALSE 12.5100 2
2008 4 surface 1 m Dissolved Oxygen DO_mgL 12.8800000 mg/L Valid FALSE 12.8800 2
2008-11-04 309 2008 11 surface 1 m Dissolved Oxygen DO_mgL 8.1400000 mg/L Valid FALSE 8.1400 2
2008 11 surface 1 m Dissolved Oxygen DO_mgL 8.1000000 mg/L Valid FALSE 8.1000 2
2008-11-18 323 2008 11 surface 1 m Dissolved Oxygen DO_mgL 9.1200000 mg/L Valid FALSE 9.1200 2
2008 11 surface 1 m Dissolved Oxygen DO_mgL 10.4100000 mg/L Valid FALSE 10.4100 2
2009-12-01 335 2009 12 surface 1 m Dissolved Oxygen DO_mgL 10.1900000 mg/L Preliminary FALSE 10.1900 2
2009 12 surface 1 m Dissolved Oxygen DO_mgL 11.1600000 mg/L Preliminary FALSE 11.1600 2
2012-03-20 80 2012 3 surface 1 m Dissolved Oxygen DO_mgL 12.5800000 mg/L Valid FALSE 12.5800 2
2012 3 surface 1 m Dissolved Oxygen DO_mgL 13.8900000 mg/L Valid FALSE 13.8900 2
2012-07-10 192 2012 7 surface 1 m Dissolved Oxygen DO_mgL 7.3500000 mg/L Valid FALSE 7.3500 2
2012 7 surface 1 m Dissolved Oxygen DO_mgL 7.4200000 mg/L Valid FALSE 7.4200 2
2012-07-17 199 2012 7 surface 1 m Dissolved Oxygen DO_mgL 6.8600000 mg/L Valid FALSE 6.8600 2
2012 7 surface 1 m Dissolved Oxygen DO_mgL 6.9500000 mg/L Valid FALSE 6.9500 2
2012-12-04 339 2012 12 surface 1 m Dissolved Oxygen DO_mgL 10.4700000 mg/L Valid FALSE 10.4700 2
2012 12 surface 1 m Dissolved Oxygen DO_mgL 11.1400000 mg/L Valid FALSE 11.1400 2
2013-03-21 80 2013 3 surface 1 m Dissolved Oxygen DO_mgL 11.5900000 mg/L Valid FALSE 11.5900 2
2013 3 surface 1 m Dissolved Oxygen DO_mgL 10.3600000 mg/L Valid FALSE 10.3600 2
2013-04-02 92 2013 4 surface 1 m Dissolved Oxygen DO_mgL 10.7900000 mg/L Valid FALSE 10.7900 2
2013 4 surface 1 m Dissolved Oxygen DO_mgL 11.4900000 mg/L Valid FALSE 11.4900 2
2013-12-03 337 2013 12 surface 1 m Dissolved Oxygen DO_mgL 10.9100000 mg/L Valid FALSE 10.9100 2
2013 12 surface 1 m Dissolved Oxygen DO_mgL 10.7400000 mg/L Valid FALSE 10.7400 2
2014-04-08 98 2014 4 surface 1 m Dissolved Oxygen DO_mgL 9.9800000 mg/L Valid FALSE 9.9800 2
2014 4 surface 1 m Dissolved Oxygen DO_mgL 10.5900000 mg/L Valid FALSE 10.5900 2
2014-11-18 322 2014 11 surface 1 m Dissolved Oxygen DO_mgL 10.3700000 mg/L Valid FALSE 10.3700 2
2014 11 surface 1 m Dissolved Oxygen DO_mgL 10.2700000 mg/L Valid FALSE 10.2700 2
2015-04-07 97 2015 4 surface 1 m Dissolved Oxygen DO_mgL 12.0000000 mg/L Valid FALSE 12.0000 2
2015 4 surface 1 m Dissolved Oxygen DO_mgL 11.9700000 mg/L Valid FALSE 11.9700 2
METC_STCR_11.3 St. Croix Lake 2016-07-31 213 2016 7 surface 0 m Secchi Depth Secchi_m 1.7500000 m Valid FALSE 1.7500 2
2016 7 surface 0 m Secchi Depth Secchi_m 1.7500000 m Valid FALSE 1.7500 2
2016 7 surface 0 m Temperature TempWater_C 26.4000000 deg C Valid FALSE 26.4000 2
2016 7 surface 0 m Temperature TempWater_C 26.4000000 deg C Valid FALSE 26.4000 2
2016-08-14 227 2016 8 surface 0 m Secchi Depth Secchi_m 1.0000000 m Valid FALSE 1.0000 2
2016 8 surface 0 m Secchi Depth Secchi_m 0.9500000 m Valid FALSE 0.9500 2
2016 8 surface 0 m Temperature TempWater_C 26.9000000 deg C Valid FALSE 26.9000 2
2016 8 surface 0 m Temperature TempWater_C 27.9000000 deg C Valid FALSE 27.9000 2
2019-06-17 168 2019 6 surface 0 m Temperature TempWater_C 21.0700000 deg C Valid FALSE 21.0700 2
2019 6 surface 0 m Temperature TempWater_C 21.2000000 deg C Valid FALSE 21.2000 2
2019-07-02 183 2019 7 surface 0 m Temperature TempWater_C 24.5000000 deg C Valid FALSE 24.5000 2
2019 7 surface 0 m Temperature TempWater_C 24.2400000 deg C Valid FALSE 24.2400 2
2019-08-19 231 2019 8 surface 0 m Temperature TempWater_C 24.1000000 deg C Valid FALSE 24.1000 2
2019 8 surface 0 m Temperature TempWater_C 24.6000000 deg C Valid FALSE 24.6000 2
2019-09-02 245 2019 9 surface 0 m Temperature TempWater_C 21.3800000 deg C Valid FALSE 21.3800 2
2019 9 surface 0 m Temperature TempWater_C 21.4000000 deg C Valid FALSE 21.4000 2
2020-05-20 141 2020 5 surface 0 m Secchi Depth Secchi_m 1.8000000 m Valid FALSE 1.8000 2
2020 5 surface 0 m Secchi Depth Secchi_m 2.4000000 m Valid FALSE 2.4000 2
2020 5 surface 0 m Temperature TempWater_C 14.4100000 deg C Valid FALSE 14.4100 2
2020 5 surface 0 m Temperature TempWater_C 14.6000000 deg C Valid FALSE 14.6000 2
2020-06-12 164 2020 6 surface 0 m Temperature TempWater_C 22.2000000 deg C Valid FALSE 22.2000 2
2020 6 surface 0 m Temperature TempWater_C 22.4000000 deg C Valid FALSE 22.4000 2
2020-07-10 192 2020 7 surface 0 m Temperature TempWater_C 27.9200000 deg C Valid FALSE 27.9200 2
2020 7 surface 0 m Temperature TempWater_C 28.5000000 deg C Valid FALSE 28.5000 2
METC_STCR_12.4 2011-09-07 250 2011 9 surface 0 m Chlorophyll-a, Pheo-Corrected ChlA_ugL 0.0000049 mg/L Valid FALSE 0.0049 2
2011 9 surface 0 m Chlorophyll-a, Pheo-Corrected ChlA_ugL 0.0000150 mg/L Valid FALSE 0.0150 2
2011 9 surface 0 m Secchi Depth Secchi_m 2.2000000 m Valid FALSE 2.2000 2
2011 9 surface 0 m Secchi Depth Secchi_m 1.6000000 m Valid FALSE 1.6000 2
2011 9 surface 0 m Temperature TempWater_C 25.5000000 deg C Valid FALSE 25.5000 2
2011 9 surface 0 m Temperature TempWater_C 24.1000000 deg C Valid FALSE 24.1000 2
METC_STCR_16.6 2016-09-01 245 2016 9 surface 0 m Secchi Depth Secchi_m 0.8500000 m Valid FALSE 0.8500 2
2016 9 surface 0 m Secchi Depth Secchi_m 1.3500000 m Valid FALSE 1.3500 2
2016 9 surface 0 m Temperature TempWater_C 23.4000000 deg C Valid FALSE 23.4000 2
2016 9 surface 0 m Temperature TempWater_C 23.9000000 deg C Valid FALSE 23.9000 2
METC_STCR_22.5 2012-06-22 174 2012 6 surface 0 m Chlorophyll-a, Pheo-Corrected ChlA_ugL 0.0000028 mg/L Valid FALSE 0.0028 2
2012 6 surface 0 m Chlorophyll-a, Pheo-Corrected ChlA_ugL 0.0000068 mg/L Valid FALSE 0.0068 2
METC_STCR_23.4 St. Croix River 2012-12-04 339 2012 12 surface 1 m Dissolved Oxygen DO_mgL 12.8900000 mg/L Valid FALSE 12.8900 2
2012 12 surface 1 m Dissolved Oxygen DO_mgL 13.1900000 mg/L Valid FALSE 13.1900 2
METC_STCR_23.6 2007-04-02 92 2007 4 surface 1 m Dissolved Oxygen DO_mgL 11.0100000 mg/L Preliminary FALSE 11.0100 2
2007 4 surface 1 m Dissolved Oxygen DO_mgL 12.2000000 mg/L Preliminary FALSE 12.2000 2
2008-03-17 77 2008 3 surface 1 m Dissolved Oxygen DO_mgL 10.2700000 mg/L Valid FALSE 10.2700 2
2008 3 surface 1 m Dissolved Oxygen DO_mgL 12.7300000 mg/L Valid FALSE 12.7300 2
2008-03-31 91 2008 3 surface 1 m Dissolved Oxygen DO_mgL 12.3200000 mg/L Valid FALSE 12.3200 2
2008 3 surface 1 m Dissolved Oxygen DO_mgL 13.6000000 mg/L Valid FALSE 13.6000 2
2008-04-07 98 2008 4 surface 1 m Dissolved Oxygen DO_mgL 12.0100000 mg/L Valid FALSE 12.0100 2
2008 4 surface 1 m Dissolved Oxygen DO_mgL 12.8600000 mg/L Valid FALSE 12.8600 2
2008-04-14 105 2008 4 surface 1 m Dissolved Oxygen DO_mgL 13.4400000 mg/L Valid FALSE 13.4400 2
2008 4 surface 1 m Dissolved Oxygen DO_mgL 12.7000000 mg/L Valid FALSE 12.7000 2
2008-11-17 322 2008 11 surface 1 m Dissolved Oxygen DO_mgL 12.5000000 mg/L Valid FALSE 12.5000 2
2008 11 surface 1 m Dissolved Oxygen DO_mgL 11.9000000 mg/L Valid FALSE 11.9000 2
2012-03-19 79 2012 3 surface 1 m Dissolved Oxygen DO_mgL 10.4000000 mg/L Valid FALSE 10.4000 2
2012 3 surface 1 m Dissolved Oxygen DO_mgL 12.2500000 mg/L Valid FALSE 12.2500 2
2012-07-10 192 2012 7 surface 1 m Dissolved Oxygen DO_mgL 7.2400000 mg/L Valid FALSE 7.2400 2
2012 7 surface 1 m Dissolved Oxygen DO_mgL 7.6600000 mg/L Valid FALSE 7.6600 2
2012-07-17 199 2012 7 surface 1 m Dissolved Oxygen DO_mgL 6.9800000 mg/L Valid FALSE 6.9800 2
2012 7 surface 1 m Dissolved Oxygen DO_mgL 7.2500000 mg/L Valid FALSE 7.2500 2
2013-04-01 91 2013 4 surface 1 m Dissolved Oxygen DO_mgL 11.3300000 mg/L Valid FALSE 11.3300 2
2013 4 surface 1 m Dissolved Oxygen DO_mgL 11.1800000 mg/L Valid FALSE 11.1800 2
2013-12-03 337 2013 12 surface 1 m Dissolved Oxygen DO_mgL 13.7200000 mg/L Valid FALSE 13.7200 2
2013 12 surface 1 m Dissolved Oxygen DO_mgL 13.3400000 mg/L Valid FALSE 13.3400 2
2014-04-08 98 2014 4 surface 1 m Dissolved Oxygen DO_mgL 11.9600000 mg/L Valid FALSE 11.9600 2
2014 4 surface 1 m Dissolved Oxygen DO_mgL 11.9900000 mg/L Valid FALSE 11.9900 2
2014-11-18 322 2014 11 surface 1 m Dissolved Oxygen DO_mgL 13.4400000 mg/L Valid FALSE 13.4400 2
2014 11 surface 1 m Dissolved Oxygen DO_mgL 13.3200000 mg/L Valid FALSE 13.3200 2
2015-04-07 97 2015 4 surface 1 m Dissolved Oxygen DO_mgL 11.0600000 mg/L Valid FALSE 11.0600 2
2015 4 surface 1 m Dissolved Oxygen DO_mgL 11.2900000 mg/L Valid FALSE 11.2900 2
metc_final <- metc_join |> 
  filter(qualifier != "Suspect") |> 
  group_by(Org_Code, Park_Code, Location_ID, Location_Name, Activity_Type,
           sample_date, doy, year, month, depth_cat, depth, depth_unit,
           PARAMETER, param_name, unit, qualifier, censored, value_cen) |> 
  summarize(value = first(value), .groups = 'drop') 

Keep only the parameters in common between GLKN and METC, then use row binding to combine the two datasets together.

# Determine parameters from METC that are on the params.csv and drop those missing.
metc_keep <- metc_final |> group_by(param_name) |> summarize(num_samps = sum(!is.na(value))) |> 
  select(param_name) |> unique() |> c()

# SACN- only keep METC params
sacn2 <- sacn_final |> filter(param_name %in% metc_keep$param_name)
#table(sacn$param_name, sacn$Location_ID)
full_dat1 <- rbind(sacn2, metc_final) |> filter(year > 2006 & year < 2025) # GLKN sites only have data from 2007 - 2024

Compile USGS discharge data

gages <- c("05341550", "05344490") # define the gage IDs before they are used in the NWIS query
dischg <- renameNWISColumns(readNWISdv(siteNumbers = gages, parameterCd = "00060"))
dischg$Location_ID[dischg$site_no == gages[1]] <- "USGS_STCR_23.5"
dischg$Location_ID[dischg$site_no == gages[2]] <- "USGS_STCR_0.2"
dischg$month <- as.numeric(format(dischg$Date, "%m"))

dis_wide <- dischg |> select(-site_no) |>
  filter(month %in% 4:11) |> # filter only GLKN-sampled months
  pivot_wider(names_from = Location_ID, values_from = c(Flow, Flow_cd)) 

dis_wide_overlap <- dis_wide |> filter(!is.na(Flow_USGS_STCR_0.2) & !is.na(Flow_USGS_STCR_23.5))

dischg2 <- dischg |> filter(Date > as.Date("2011-09-09", format = "%Y-%m-%d"))

dp <- 
ggplot(dischg2, aes(x = Date, y = Flow, group = Location_ID, color = Location_ID)) + 
  geom_line() + theme_WQ() + labs(color = NULL) +
  scale_color_brewer(palette = "Set1")

d_lm <- 
ggplot(dis_wide_overlap, aes(x = Flow_USGS_STCR_23.5, y = Flow_USGS_STCR_0.2)) + 
  geom_point(color = 'grey', alpha = 0.4) +
  geom_smooth(method = 'lm') + theme_WQ() + geom_abline(slope = 1, intercept = 0)

Plot of discharge for overlapping sample periods of the stream gages.

Check of how well the gage at 23.5 predicts discharge at 0.2. Error increases somewhat at larger flows, but there are few observations in the higher ranges and the increase in variance is modest. The SACN_STCR_15.8 site is between the two gages, and I’m wondering if I can interpolate discharge for that site based on the other two gages.
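One simple way to approach the interpolation idea above is a distance-weighted average of the two gages. The sketch below is only an illustration under stated assumptions: `interp_flow()` is a hypothetical helper (not part of waterGLKN or dataRetrieval), the numbers in the gage labels are treated as river miles, discharge is assumed to change linearly with river mile (which ignores tributary inflows and travel time), and the flows are toy values rather than real gage data.

```r
# Hypothetical helper: distance-weighted linear interpolation between two gages.
# Assumes discharge changes linearly with river mile between the gages.
interp_flow <- function(flow_up, flow_down, rm_up, rm_down, rm_site) {
  stopifnot(rm_up > rm_down, rm_site <= rm_up, rm_site >= rm_down)
  w_down <- (rm_up - rm_site) / (rm_up - rm_down) # weight on the downstream gage
  (1 - w_down) * flow_up + w_down * flow_down
}

# Toy daily flows (cfs); not real gage data
toy <- data.frame(Flow_USGS_STCR_23.5 = c(4000, 6000, NA),
                  Flow_USGS_STCR_0.2  = c(5000, 6500, 7000))

# Interpolate to river mile 15.8; days missing either gage stay NA
toy$Flow_SACN_STCR_15.8 <- interp_flow(toy$Flow_USGS_STCR_23.5,
                                       toy$Flow_USGS_STCR_0.2,
                                       rm_up = 23.5, rm_down = 0.2, rm_site = 15.8)
toy
```

Because 15.8 is closer to the upstream gage, roughly two thirds of the weight falls on the 23.5 record. Whether a linear weighting is adequate would need to be checked against any independent discharge measurements near the site.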

Check on flow approval codes. Most are A (approved); a few are P (provisional). An ‘e’ indicates an estimated value.

table(dischg$Location_ID, dischg$Flow_cd)
##                 
##                     A  A e   P P e P Ice
##   USGS_STCR_0.2  5251  644 465   3    75
##   USGS_STCR_23.5 2958 1456 371  81   108
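If the approval codes need to be used as a filter or covariate, it may help to split them into an approval status and an estimated flag. This is a minimal sketch on toy codes, assuming the first token of `Flow_cd` is the approval status and anything after it (such as `e` or `Ice`) is a remark; the `approved` and `estimated` column names are made up for illustration.

```r
# Toy approval codes in the same format as Flow_cd; not real gage data
toy <- data.frame(
  Location_ID = c("USGS_STCR_0.2", "USGS_STCR_0.2", "USGS_STCR_23.5",
                  "USGS_STCR_23.5", "USGS_STCR_23.5"),
  Flow_cd = c("A", "P e", "A e", "P Ice", "A")
)

# Approved records start with "A"; provisional records start with "P"
toy$approved <- grepl("^A", toy$Flow_cd)

# Estimated records carry a standalone "e" token (so "Ice" is not matched)
toy$estimated <- grepl("(^| )e( |$)", toy$Flow_cd)

# Share of approved vs. not-approved records within each gage
round(prop.table(table(toy$Location_ID, toy$approved), margin = 1), 2)
```

The same two columns could be added to `dischg` before pivoting to wide format, if the analysis should be restricted to approved, non-estimated flows.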

Correlation in discharge between the two sites

round(cor(dis_wide_overlap[,c("Flow_USGS_STCR_23.5", "Flow_USGS_STCR_0.2")]), 2)[1,2] # 0.96
## [1] 0.96
mod <- lm(Flow_USGS_STCR_0.2 ~ Flow_USGS_STCR_23.5, data = dis_wide_overlap)
summary(mod)
## 
## Call:
## lm(formula = Flow_USGS_STCR_0.2 ~ Flow_USGS_STCR_23.5, data = dis_wide_overlap)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -20364.5   -883.4    -95.9    877.2   9112.9 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         834.260334  45.786122   18.22   <2e-16 ***
## Flow_USGS_STCR_23.5   0.963512   0.005074  189.90   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1672 on 3271 degrees of freedom
## Multiple R-squared:  0.9168, Adjusted R-squared:  0.9168 
## F-statistic: 3.606e+04 on 1 and 3271 DF,  p-value: < 2.2e-16
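A fitted model like this can be used to predict downstream discharge from upstream discharge, with a prediction interval to show the uncertainty around each estimate. The sketch below uses synthetic data as a stand-in for `dis_wide_overlap` (the simulated slope, intercept, and noise are assumptions chosen to loosely resemble the output above), so the numbers it returns are illustrative only.

```r
set.seed(42)

# Synthetic stand-in for dis_wide_overlap; values are not real discharge
toy <- data.frame(Flow_USGS_STCR_23.5 = runif(200, 1000, 20000))
toy$Flow_USGS_STCR_0.2 <- 800 + 0.96 * toy$Flow_USGS_STCR_23.5 + rnorm(200, sd = 1500)

toy_mod <- lm(Flow_USGS_STCR_0.2 ~ Flow_USGS_STCR_23.5, data = toy)

# Predict downstream flow with 95% prediction intervals for new upstream flows
new_flow <- data.frame(Flow_USGS_STCR_23.5 = c(2000, 8000, 15000))
pred <- predict(toy_mod, newdata = new_flow, interval = "prediction", level = 0.95)

# Columns: upstream flow, fitted downstream flow, lower and upper bounds
cbind(new_flow, round(pred))
```

Using the coefficients reported above, an upstream flow of 5,000 cfs corresponds to a predicted downstream flow of about 834.26 + 0.9635 × 5000 ≈ 5,652 cfs. With a residual standard error of 1672, the prediction interval around any single day is wide relative to low flows.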

Sampling Intensity

Code

full_dat2 <- full_dat1 |> filter(!Location_ID %in% "METC_STCR_23.4") # 23.4 has a very narrow period of record. Dropping

full_dat2$site_abbr <- factor(gsub("_STCR", "", full_dat2$Location_ID),
                              levels = c("METC_23.6", #"METC_23.4", 
                                         "METC_22.5", "METC_22.3",
                                         "SACN_20.0", "METC_16.6", "SACN_15.8", "METC_15.3",
                                         "METC_12.4", "METC_11.3", "SACN_2.0", "METC_0.1"))

full_dat2$site_order <- factor(full_dat2$Location_ID, 
                               levels = c("METC_STCR_23.6", #"METC_STCR_23.4", 
                                          "METC_STCR_22.5", "METC_STCR_22.3",
                                          "SACN_STCR_20.0", "METC_STCR_16.6", "SACN_STCR_15.8", "METC_STCR_15.3",
                                          "METC_STCR_12.4",  "METC_STCR_11.3", "SACN_STCR_2.0", "METC_STCR_0.1"))
full_dat2$sample_date <- as.Date(full_dat2$sample_date, format = "%Y-%m-%d")

# Add discharge as columns, so they can be modeled as covariates
full_dat3 <- left_join(full_dat2, 
                      dis_wide |> select(Date, Flow_USGS_STCR_0.2, Flow_USGS_STCR_23.5, 
                                         Flow_cd_USGS_STCR_0.2, Flow_cd_USGS_STCR_23.5), 
                      by = c("sample_date" = "Date"))

# Add water temperature as a column, so it can be modeled as a covariate. Because there are occasionally
# duplicate water temps in the METC data, but they're always within a few tenths of a degree, taking 
# the average of the two measurements. 
wtemp <- full_dat3 |> filter(param_name == "TempWater_C") |> 
  select(Location_ID, sample_date, depth_cat, depth, WaterTemp_C = value) |> 
  filter(!is.na(WaterTemp_C)) |> 
  group_by(Location_ID, sample_date, depth_cat, depth) |> 
  summarize(WaterTemp_C = mean(WaterTemp_C), .groups = 'drop')
full_dat <- left_join(full_dat3, wtemp, by = c("Location_ID", "sample_date", "depth", "depth_cat")) 

param_site_year_qual <- full_dat |> 
  group_by(Site = site_abbr, Parameter = param_name, Year = year, Org_Code, qualifier) |> 
  summarize(Num_Samps = sum(!is.na(param_name)), .groups = 'drop') |> 
  arrange(Site, Parameter, Year, qualifier)

# params_site_year <- full_dat |> group_by(Site = site_abbr, Parameter = param_name, Year = year, Org_Code) |> 
#   summarize(Num_Samps = sum(!is.na(value)),
#             .groups = 'drop') |> 
#   arrange(Site, Parameter, Year)

params_site_year_wide <- param_site_year_qual |> #filter(Num_Samps > 0) |> 
  pivot_wider(names_from = Site, values_from = Num_Samps) |> 
  arrange(Parameter, Year)

cols <- names(params_site_year_wide[,5:ncol(params_site_year_wide)]) # site columns start after Parameter, Year, Org_Code, qualifier
params_site_year_wide[,cols][params_site_year_wide[,cols] == 0] <- NA_real_

dischg$month <- as.numeric(format(dischg$Date, "%m"))
dischg$year <- as.numeric(format(dischg$Date, "%Y"))

dis_yrm <- dischg |> 
  #filter(month %in% 4:11) |> 
  group_by(Location_ID, month, year) |> 
  summarize(Num_Dis = sum(!is.na(Flow)),
            .groups = 'drop') 

dis_yr <- dischg |> 
  #filter(month %in% 4:11) |> 
  group_by(Location_ID, year) |> 
  summarize(Num_Dis = sum(!is.na(Flow)),
            .groups = 'drop') 

# sampling matrix 
psy_tab <- kable(params_site_year_wide, format = 'html', align = 'c', 
  caption = "Sampling matrix by site for every parameter and year. Values are number of non-QAQC samples within a year. Site codes start with 'METC' for Metropolitan Council sites, and 'SACN' for GLKN-monitored sites. Numbers indicate the stream miles.") |> 
  kable_styling(fixed_thead = T, bootstrap_options = c("condensed"), full_width = F) |> 
  column_spec(1:ncol(params_site_year_wide), border_left = T, border_right = T) |> 
  collapse_rows(1, valign = 'top') 

params_site_year1 <- param_site_year_qual |> 
  mutate(num_samps_bin = case_when(between(Num_Samps, 1, 5) ~ 1,
                                   between(Num_Samps, 6, 10) ~ 2,
                                   between(Num_Samps, 11, 15) ~ 3,
                                   between(Num_Samps, 16, 20) ~ 4,
                                   Num_Samps > 20 ~ 5),
         qual_simp = case_when(qualifier %in% c("Valid", "Preliminary") ~ "Detected and Quantified",
                               qualifier == "Suspect" ~ "Not Reported",
                               TRUE ~ qualifier)) |> 
 filter(Num_Samps > 0)

# Create heat map of overall sampling intensity
psy_plot <- 
  ggplot(params_site_year1, aes(x = Site, y = Year)) +
    geom_tile(aes(fill = num_samps_bin), color = 'grey') + facet_wrap(~Parameter, ncol = 3) +
    scale_fill_distiller(palette = "Spectral", guide = "legend", name = "Num. Sample Bins", 
                         labels = c("Bin 1-5", "Bin 6-10", "Bin 11-15", "Bin 16-20", "Bin > 20")) +
    waterGLKN::theme_WQ() + # apply the base theme first so the axis text rotation below is kept
    theme(axis.text.x = element_text(angle = 90, hjust = 0, vjust = 0.5)) +
    labs(x = NULL) +
    scale_y_continuous(breaks = c(2006, 2010, 2014, 2018, 2022))
# Plotting function to iterate through each parameter's detects
# (superseded by the binned version of param_detect_heatmap defined further down)
param_detect_heatmap <- function(dat = params_site_year1, param = NA){
  p <-
    ggplot(dat, aes(x = Site, y = Year)) +
    geom_tile(aes(fill = Num_Samps), color = 'grey') +
    facet_wrap(~qual_simp)+
    scale_fill_distiller(palette = "Spectral", guide = 'legend', 
                         name = "# samples") +
    theme_bw() + # complete theme first, so the axis text tweak below is not overwritten
    theme(axis.text.x = element_text(angle = 90, hjust = 0, vjust = 0.5)) +
    labs(x = NULL, y = NULL)

    return(p)
}

dis_ym_plot <- 
  ggplot(dis_yrm, aes(x = Location_ID, y = month)) +
  geom_tile(aes(fill = Num_Dis), color = 'grey') + 
  facet_grid(rows = vars(year)) +
  scale_fill_distiller(palette = "Spectral", guide = 'legend', 
                       name = "# samples") +
  theme(axis.text.x = element_text(angle = 90, hjust = 0, vjust = 0.5)) +
  labs(x = NULL, y = NULL) +
  scale_y_continuous(breaks = seq(1, 12, 2), labels = c("Jan", "Mar", "May", "Jul", "Sep", "Nov"))

params_site_year_month <- full_dat |> group_by(Site = site_abbr, param_name, year, month) |> 
  summarize(num_samps = sum(!is.na(value)), .groups = 'drop') |> filter(num_samps > 0)

# Plotting function to iterate through each parameter
# param_heatmap <- function(dat = param_site_year_month, param = NA){
#   p <-
#     ggplot(dat, aes(x = Site, y = month)) +
#     geom_tile(aes(fill = num_samps), color = 'grey') +
#     facet_grid(rows = vars(year), cols = vars(qual_simp))+
#     scale_fill_distiller(palette = "Spectral", guide = 'legend', 
#                          name = "# samples") +
#     theme(axis.text.x = element_text(angle = 90, hjust = 0, vjust = 0.5)) +
#     labs(x = NULL, y = NULL) +
#     scale_y_reverse(breaks = seq(1, 12, 2), labels = c("Jan", "Mar", "May", "Jul", "Sep", "Nov"))
# 
#     return(p)
# }

param_detect_heatmap <- function(dat = params_site_year1, param = NA){
  p <-
    ggplot(dat, aes(x = Site, y = Year)) +
    geom_tile(aes(fill = factor(num_samps_bin)), color = 'grey') +
    facet_wrap(~factor(qual_simp), ncol = 4) +
    scale_fill_manual(guide = 'legend', 
                      values = c("#067bc2", "#80e377", "#ecc30b", "#f37748", "#d56062"),
                      name = "Binned # of samples",
                      breaks = 1:5,
                      labels = c("1-5", "6-10", "11-15", "16-20", ">20"), 
                      drop = F) +
    theme_bw() +
    labs(x = NULL, y = NULL) +
    scale_y_continuous(breaks = seq(2006, 2024, 2)) +
    theme(axis.text.x = element_text(angle = 90, hjust = 0, vjust = 0.5)) 

    return(p)
}
param_list <- sort(unique(full_dat$param_name))

Sampling Matrix

Sampling matrix by site for every parameter and year. Values are number of non-QAQC samples within a year. Site codes start with ‘METC’ for Metropolitan Council sites, and ‘SACN’ for GLKN-monitored sites. Numbers indicate the stream miles.
Parameter Year Org_Code qualifier METC_23.6 METC_22.5 METC_22.3 SACN_20.0 METC_16.6 SACN_15.8 METC_15.3 METC_12.4 METC_11.3 SACN_2.0 METC_0.1
Alkalinity_mgL 2007 GLKN Detected and Quantified 3 3 3
2008 GLKN Detected and Quantified 3 3 6
2009 GLKN Detected and Quantified 3 3 3
2010 GLKN Detected and Quantified 3 3 3
2011 GLKN Detected and Quantified 3 3 3
2012 GLKN Detected and Quantified 3 3 3
2013 GLKN Detected and Quantified 3 3 3
2014 GLKN Detected and Quantified 5 5 5
2015 GLKN Detected and Quantified 3 3 3
2016 METC Valid 1 2
2016 GLKN Detected and Quantified 3 3 3
2017 METC Valid 24 25
2017 GLKN Detected and Quantified 3 3 3
2018 METC Valid 26 26
2018 GLKN Detected and Quantified 3 3 3
2019 METC Valid 12 12
2019 GLKN Detected and Quantified 3 3 3
2020 METC Preliminary 1 1
2020 METC Valid 3 3
2020 GLKN Detected and Quantified 3 3 3
2021 METC Valid 4 4
2021 GLKN Detected and Quantified 2 2 2
2021 GLKN Present Below Quantification Limit 1 1 1
2022 METC Valid 4 4
2022 GLKN Detected and Quantified 2 2 2
2023 METC Valid 4 4
2023 GLKN Detected and Quantified 2 2 2
2023 GLKN Present Below Quantification Limit 1 1 1
2024 METC Valid 4 4
2024 GLKN Detected and Quantified 1 1 1
2024 GLKN Present Below Quantification Limit 2 2 2
Ca_mgL 2007 METC Preliminary 22 22
2007 GLKN Detected and Quantified 3 3 3
2008 METC Preliminary 12 12
2008 GLKN Detected and Quantified 3 3 6
2009 METC Preliminary 12 12
2009 GLKN Detected and Quantified 3 3 3
2010 METC Preliminary 16 16
2010 GLKN Detected and Quantified 3 3 3
2011 METC Preliminary 23 23
2011 GLKN Detected and Quantified 3 3 3
2012 METC Preliminary 18 22
2012 GLKN Detected and Quantified 3 3 3
2013 METC Preliminary 23 23
2013 GLKN Detected and Quantified 3 3 3
2014 METC Preliminary 24 24
2014 GLKN Detected and Quantified 4 4 4
2015 METC Valid 25 24
2015 GLKN Detected and Quantified 3 3 3
2016 METC Valid 27 27
2016 GLKN Detected and Quantified 3 3 3
2017 METC Valid 24 24
2017 GLKN Detected and Quantified 3 3 3
2018 METC Valid 26 26
2018 GLKN Detected and Quantified 3 3 3
2019 METC Valid 12 12
2019 GLKN Detected and Quantified 3 3 3
2020 METC Valid 1 1
2020 GLKN Detected and Quantified 3 3 3
2021 GLKN Detected and Quantified 3 3 3
2022 GLKN Detected and Quantified 2 2 2
2023 GLKN Detected and Quantified 3 3 3
2024 GLKN Detected and Quantified 3 3 3
ChlA_ugL 2007 METC Valid 43 3 9 7 10 42
2008 METC Valid 41 3 10 8 9 40
2008 GLKN Detected and Quantified 3 3 6
2009 METC Valid 42 8 11 3 13 14 41
2009 GLKN Detected and Quantified 8 8 8
2010 METC Valid 43 12 12 6 15 16 43
2010 GLKN Detected and Quantified 3 3 3
2011 METC Valid 42 11 15 6 10 16 43
2011 GLKN Detected and Quantified 8 8 8
2012 METC Valid 34 11 3 16 6 6 16 41
2012 GLKN Detected and Quantified 3 3 3
2013 METC Valid 42 12 13 7 6 17 42
2013 GLKN Detected and Quantified 7 7 7
2014 METC Valid 42 8 11 4 4 15 41
2014 GLKN Detected and Quantified 8 8 8
2015 METC Valid 42 5 4 5 5 16 41
2015 GLKN Detected and Quantified 8 8 8
2016 METC Valid 41 42
2016 GLKN Detected and Quantified 8 8 8
2017 METC Preliminary 3 3
2017 METC Valid 39 8 8 7 5 16 39
2017 GLKN Detected and Quantified 7 8 8
2017 GLKN Not Reported 1
2018 METC Valid 44 8 5 5 10 15 41
2018 GLKN Detected and Quantified 7 7 7
2019 METC Valid 39 5 1 2 10 15 37
2019 GLKN Detected and Quantified 8 8 8
2020 METC Valid 35 8 15 35
2020 GLKN Detected and Quantified 6 6 6
2021 METC Valid 43 9 7 7 41
2021 GLKN Detected and Quantified 6 7 5
2021 GLKN Present Below Quantification Limit 1 2
2022 METC Valid 41 10 4 8 42
2022 GLKN Detected and Quantified 6 5 5
2022 GLKN Present Below Quantification Limit 1 2 2
2023 METC Valid 43 7 3 4 40
2023 GLKN Detected and Quantified 7 8 7
2023 GLKN Present Below Quantification Limit 1
2024 METC Valid 43 9 2 6 42
2024 GLKN Detected and Quantified 6 5 6
2024 GLKN Not Detected 1 1 2
Cl_mgL 2007 METC Preliminary 41 38
2007 GLKN Detected and Quantified 3 3 3
2008 METC Preliminary 1
2008 METC Valid 39 39
2008 GLKN Detected and Quantified 3 3 6
2009 METC Valid 43 42
2009 GLKN Detected and Quantified 3 3 3
2010 METC Valid 38 40
2010 GLKN Detected and Quantified 3 3 3
2011 METC Preliminary 42 40
2011 GLKN Detected and Quantified 3 3 3
2012 METC Valid 34 42
2012 GLKN Detected and Quantified 3 3 3
2013 METC Valid 42 42
2013 GLKN Detected and Quantified 2 2 2
2013 GLKN Present Below Quantification Limit 1 1 1
2014 METC Valid 42 42
2014 GLKN Detected and Quantified 4 4 4
2015 METC Valid 42 41
2015 GLKN Detected and Quantified 3 3 3
2016 METC Valid 42 41
2016 GLKN Detected and Quantified 3 3 3
2017 METC Valid 42 41
2017 GLKN Detected and Quantified 3 3 3
2018 METC Valid 44 42
2018 GLKN Detected and Quantified 3 3 3
2019 METC Valid 44 42
2019 GLKN Detected and Quantified 3 3 3
2020 METC Preliminary 1 1
2020 METC Valid 35 34
2020 GLKN Detected and Quantified 3 3 3
2021 METC Valid 43 42
2021 GLKN Detected and Quantified 3 3 3
2022 METC Valid 42 42
2022 GLKN Detected and Quantified 2 2 2
2023 METC Valid 43 42
2023 GLKN Detected and Quantified 2 2
2023 GLKN Present Below Quantification Limit 1 1 3
2024 METC Valid 43 42
2024 GLKN Detected and Quantified 3 3 3
DOC_mgL 2007 GLKN Detected and Quantified 3 3 3
2008 GLKN Detected and Quantified 3 3 6
2009 GLKN Detected and Quantified 3 3 3
2010 GLKN Detected and Quantified 3 3 3
2011 GLKN Detected and Quantified 3 3 3
2012 GLKN Detected and Quantified 3 3 3
2013 GLKN Detected and Quantified 3 3 3
2014 GLKN Detected and Quantified 3 3 4
2014 GLKN Present Below Quantification Limit 1 1
2015 GLKN Detected and Quantified 3 3 3
2016 GLKN Detected and Quantified 3 3 3
2017 METC Valid 4 4
2017 GLKN Detected and Quantified 3 3 3
2018 METC Valid 4 4
2018 GLKN Detected and Quantified 3 3 3
2019 METC Valid 5 5
2019 GLKN Detected and Quantified 3 3 3
2020 METC Valid 3 3
2020 GLKN Detected and Quantified 3 3 3
2021 METC Valid 4 4
2021 GLKN Detected and Quantified 3 3 3
2022 METC Valid 4 4
2022 GLKN Detected and Quantified 2 2 2
2023 METC Valid 4 4
2023 GLKN Detected and Quantified 2 3 3
2023 GLKN Present Below Quantification Limit 1
2024 METC Valid 4 4
2024 GLKN Detected and Quantified 3 3 3
DO_mgL 2007 METC Preliminary 45 47
2007 GLKN Detected and Quantified 88 71 123
2008 METC Preliminary 1
2008 METC Valid 45 47
2008 GLKN Detected and Quantified 34 21 94
2009 METC Preliminary 12 11
2009 METC Valid 33 34
2009 GLKN Detected and Quantified 91 72 123
2010 METC Preliminary 11 11
2010 METC Valid 31 35
2010 GLKN Detected and Quantified 38 33 46
2011 METC Preliminary 20 13
2011 METC Valid 26 31
2011 GLKN Detected and Quantified 98 67 130
2012 METC Valid 36 45
2012 GLKN Detected and Quantified 33 26 46
2013 METC Valid 44 45
2013 GLKN Detected and Quantified 82 71 112
2014 METC Valid 44 44
2014 GLKN Detected and Quantified 104 83 131
2015 METC Valid 43 43
2015 GLKN Detected and Quantified 90 72 123
2016 METC Valid 41 42
2016 GLKN Detected and Quantified 91 84 125
2017 METC Valid 42 42
2017 GLKN Detected and Quantified 94 79 126
2018 METC Valid 44 41
2018 GLKN Detected and Quantified 83 74 113
2019 METC Valid 44 95 42
2019 GLKN Detected and Quantified 102 82 136
2020 METC Valid 37 73 36
2020 GLKN Detected and Quantified 66 62 89
2021 METC Valid 43 42
2021 GLKN Detected and Quantified 78 65 104
2022 METC Valid 42 42
2022 GLKN Detected and Quantified 82 77 113
2023 METC Valid 43 42
2023 GLKN Detected and Quantified 70 71 110
2024 METC Valid 43 42
2024 GLKN Detected and Quantified 71 60 112
DOsat_pct 2007 GLKN Detected and Quantified 88 71 123
2008 GLKN Detected and Quantified 34 21 94
2009 GLKN Detected and Quantified 91 72 123
2010 GLKN Detected and Quantified 38 33 46
2011 GLKN Detected and Quantified 98 67 130
2012 GLKN Detected and Quantified 33 26 46
2013 GLKN Detected and Quantified 82 71 112
2014 GLKN Detected and Quantified 104 83 131
2015 GLKN Detected and Quantified 90 72 123
2016 GLKN Detected and Quantified 91 84 125
2017 GLKN Detected and Quantified 94 79 126
2018 GLKN Detected and Quantified 83 74 113
2019 GLKN Detected and Quantified 102 82 136
2019 METC Valid 95
2020 GLKN Detected and Quantified 66 62 89
2020 METC Valid 73
2021 GLKN Detected and Quantified 78 65 104
2022 GLKN Detected and Quantified 82 77 113
2023 GLKN Detected and Quantified 70 71 110
2024 GLKN Detected and Quantified 71 60 112
K_mgL 2007 METC Preliminary 22 22
2007 GLKN Detected and Quantified 3 3 3
2008 METC Preliminary 12 12
2008 GLKN Detected and Quantified 3 3 6
2009 METC Preliminary 12 12
2009 GLKN Detected and Quantified 3 3 3
2010 METC Preliminary 16 16
2010 GLKN Detected and Quantified 3 3 3
2011 METC Preliminary 23 23
2011 GLKN Detected and Quantified 3 3 3
2012 METC Preliminary 18 22
2012 GLKN Detected and Quantified 3 3 3
2013 METC Preliminary 23 23
2013 GLKN Detected and Quantified 3 3 3
2014 METC Preliminary 24 24
2014 GLKN Detected and Quantified 4 4 4
2015 METC Valid 25 24
2015 GLKN Detected and Quantified 3 3 3
2016 METC Valid 27 27
2016 GLKN Detected and Quantified 3 3 3
2017 METC Valid 24 24
2017 GLKN Detected and Quantified 3 3 3
2018 METC Valid 26 26
2018 GLKN Detected and Quantified 3 3 3
2019 METC Valid 12 12
2019 GLKN Detected and Quantified 3 3 3
2020 METC Valid 1 1
2020 GLKN Detected and Quantified 3 3 3
2021 GLKN Detected and Quantified 3 3 3
2022 GLKN Detected and Quantified 2 2 2
2023 GLKN Detected and Quantified 3 3 3
2024 GLKN Detected and Quantified 3 3 3
Mg_mgL 2007 METC Preliminary 22 22
2007 GLKN Detected and Quantified 3 3 3
2008 METC Preliminary 12 12
2008 GLKN Detected and Quantified 3 3 6
2009 METC Preliminary 12 12
2009 GLKN Detected and Quantified 3 3 3
2010 METC Preliminary 16 16
2010 GLKN Detected and Quantified 3 3 3
2011 METC Preliminary 23 23
2011 GLKN Detected and Quantified 3 3 3
2012 METC Preliminary 18 22
2012 GLKN Detected and Quantified 3 3 3
2013 METC Preliminary 23 23
2013 GLKN Detected and Quantified 3 3 3
2014 METC Preliminary 24 24
2014 GLKN Detected and Quantified 4 4 4
2015 METC Valid 25 24
2015 GLKN Detected and Quantified 3 3 3
2016 METC Valid 27 27
2016 GLKN Detected and Quantified 3 3 3
2017 METC Valid 24 24
2017 GLKN Detected and Quantified 3 3 3
2018 METC Valid 26 26
2018 GLKN Detected and Quantified 3 3 3
2019 METC Valid 12 12
2019 GLKN Detected and Quantified 3 3 3
2020 METC Valid 1 1
2020 GLKN Detected and Quantified 3 3 3
2021 GLKN Detected and Quantified 3 3 3
2022 GLKN Detected and Quantified 2 2 2
2023 GLKN Detected and Quantified 3 3 3
2024 GLKN Detected and Quantified 3 3 3
NH4_ugL 2007 METC Preliminary 26 26
2007 GLKN Detected and Quantified 8 8 8
2008 METC Preliminary 25 26
2008 GLKN Detected and Quantified 3 3 6
2009 METC Valid 28 27
2009 GLKN Detected and Quantified 2 3 3
2009 GLKN Not Detected 5 5 4
2009 GLKN Present Below Quantification Limit 1 1
2010 METC Valid 23 24
2010 GLKN Detected and Quantified 2 2 1
2010 GLKN Present Below Quantification Limit 1 1 1
2010 GLKN Not Detected 1
2011 METC Preliminary 27 27
2011 GLKN Detected and Quantified 6 6 7
2011 GLKN Present Below Quantification Limit 2 2
2011 GLKN Not Detected 1
2012 METC Valid 21 24
2012 GLKN Detected and Quantified 3 1 2
2012 GLKN Present Below Quantification Limit 2 1
2013 METC Valid 25 25
2013 GLKN Detected and Quantified 7 7 6
2013 GLKN Present Below Quantification Limit 1
2014 METC Valid 9 9
2014 GLKN Detected and Quantified 7 8 7
2014 GLKN Present Below Quantification Limit 1 1
2015 GLKN Detected and Quantified 8 7 5
2015 GLKN Present Below Quantification Limit 1 3
2016 GLKN Detected and Quantified 8 8 7
2016 GLKN Present Below Quantification Limit 1
2017 GLKN Detected and Quantified 5 6 5
2017 GLKN Present Below Quantification Limit 3 2 3
2018 GLKN Detected and Quantified 6 6 4
2018 GLKN Present Below Quantification Limit 1 1 2
2018 GLKN Not Detected 1
2019 GLKN Detected and Quantified 7 8 7
2019 GLKN Present Below Quantification Limit 1 1
2020 GLKN Detected and Quantified 6 5 5
2020 GLKN Present Below Quantification Limit 1 1
2021 GLKN Detected and Quantified 2 7 4
2021 GLKN Present Below Quantification Limit 5 1
2021 GLKN Not Detected 2
2022 GLKN Detected and Quantified 6 7 5
2022 GLKN Present Below Quantification Limit 1
2022 GLKN Not Detected 2
2023 GLKN Detected and Quantified 3 5 5
2023 GLKN Not Detected 2 2 2
2023 GLKN Present Below Quantification Limit 2 1 1
2024 GLKN Detected and Quantified 6 6 7
2024 GLKN Not Detected 1 1
NO2+NO3_ugL 2007 GLKN Detected and Quantified 8 8 8
2008 GLKN Detected and Quantified 3 3 6
2009 GLKN Detected and Quantified 8 8 8
2010 GLKN Detected and Quantified 3 3 3
2011 GLKN Detected and Quantified 8 8 8
2012 GLKN Detected and Quantified 3 3 3
2013 GLKN Detected and Quantified 7 7 7
2014 GLKN Detected and Quantified 8 8 8
2015 GLKN Detected and Quantified 8 8 8
2016 GLKN Detected and Quantified 8 8 8
2017 GLKN Detected and Quantified 8 8 8
2018 METC Valid 22 20
2018 GLKN Detected and Quantified 7 7 7
2019 METC Valid 44 42
2019 GLKN Detected and Quantified 8 8 8
2020 METC Preliminary 1 1
2020 METC Valid 35 34
2020 GLKN Detected and Quantified 6 6 6
2021 METC Valid 43 41
2021 GLKN Detected and Quantified 7 7 7
2021 METC Preliminary 1
2022 METC Valid 42 42
2022 GLKN Detected and Quantified 7 7 7
2023 METC Valid 43 41
2023 GLKN Detected and Quantified 7 8 8
2024 METC Valid 42 42
2024 GLKN Detected and Quantified 6 6 7
2024 GLKN Not Detected 1 1
Na_mgL 2007 METC Preliminary 22 22
2007 GLKN Detected and Quantified 3 3 3
2008 METC Preliminary 12 12
2008 GLKN Detected and Quantified 3 3 6
2009 METC Preliminary 12 12
2009 GLKN Detected and Quantified 3 3 3
2010 METC Preliminary 16 16
2010 GLKN Detected and Quantified 3 3 3
2011 METC Preliminary 23 23
2011 GLKN Detected and Quantified 3 3 3
2012 METC Preliminary 18 22
2012 GLKN Detected and Quantified 3 3 3
2013 METC Preliminary 23 23
2013 GLKN Detected and Quantified 2 2 2
2013 GLKN Not Detected 1 1 1
2014 METC Preliminary 24 24
2014 GLKN Detected and Quantified 4 4 4
2015 METC Valid 25 24
2015 GLKN Detected and Quantified 3 3 3
2016 METC Valid 27 27
2016 GLKN Detected and Quantified 3 3 3
2017 METC Valid 24 24
2017 GLKN Detected and Quantified 3 3 3
2018 METC Valid 26 26
2018 GLKN Detected and Quantified 3 3 3
2019 METC Valid 12 12
2019 GLKN Detected and Quantified 3 3 3
2020 METC Valid 1 1
2020 GLKN Detected and Quantified 3 3 3
2021 GLKN Detected and Quantified 3 3 3
2022 GLKN Detected and Quantified 2 2 2
2023 GLKN Detected and Quantified 3 3 3
2024 GLKN Detected and Quantified 3 3 3
P_ugL 2007 METC Preliminary 40 38
2007 GLKN Detected and Quantified 9 8 9
2008 METC Valid 40
2008 GLKN Detected and Quantified 4 3 7
2008 METC Preliminary 39
2009 METC Valid 40
2009 GLKN Detected and Quantified 9 8 9
2009 METC Preliminary 42
2010 METC Valid 42
2010 GLKN Detected and Quantified 4 3 4
2010 METC Preliminary 42
2011 METC Preliminary 41 41
2011 GLKN Detected and Quantified 9 8 9
2012 METC Valid 34 42
2012 GLKN Detected and Quantified 4 3 4
2013 METC Valid 42 40
2013 GLKN Detected and Quantified 8 7 8
2014 METC Valid 41 41
2014 GLKN Detected and Quantified 8 8 9
2015 METC Valid 42 41
2015 GLKN Detected and Quantified 9 8 9
2016 METC Valid 41 42
2016 GLKN Detected and Quantified 9 8 9
2017 METC Valid 42 41
2017 GLKN Detected and Quantified 9 8 9
2018 METC Valid 43 42
2018 GLKN Detected and Quantified 8 7 8
2019 METC Valid 44 42
2019 GLKN Detected and Quantified 9 8 9
2020 METC Valid 37 35
2020 GLKN Detected and Quantified 7 6 7
2020 METC Preliminary 1
2021 METC Valid 43 42
2021 GLKN Detected and Quantified 8 7 8
2022 METC Valid 42 42
2022 GLKN Detected and Quantified 8 7 8
2023 METC Valid 43 42
2023 GLKN Detected and Quantified 8 8 9
2024 METC Valid 14 13
2024 GLKN Detected and Quantified 7 6 8
SO4_mgL 2007 METC Preliminary 22 22
2007 GLKN Detected and Quantified 3 3 3
2008 METC Preliminary 13 12
2008 GLKN Detected and Quantified 3 3 6
2009 METC Preliminary 12 12
2009 GLKN Detected and Quantified 3 3 3
2010 METC Preliminary 16 16
2010 GLKN Detected and Quantified 3 3 3
2011 METC Preliminary 22 23
2011 GLKN Detected and Quantified 3 3 3
2012 METC Preliminary 18 22
2012 GLKN Detected and Quantified 3 3 3
2013 METC Preliminary 22 22
2013 GLKN Detected and Quantified 3 3 3
2014 METC Preliminary 23 24
2014 GLKN Detected and Quantified 4 4 4
2015 METC Valid 25 24
2015 GLKN Detected and Quantified 3 3 3
2016 METC Valid 27 27
2016 GLKN Detected and Quantified 3 3 3
2017 METC Valid 24 24
2017 GLKN Detected and Quantified 3 3 3
2018 METC Valid 26 25
2018 GLKN Detected and Quantified 3 3 3
2019 METC Valid 12 12
2019 GLKN Detected and Quantified 3 3 3
2020 METC Valid 4 4
2020 GLKN Detected and Quantified 3 3 3
2021 METC Valid 4 4
2021 GLKN Detected and Quantified 3 3 3
2022 METC Valid 4 4
2022 GLKN Detected and Quantified 2 2 2
2023 METC Valid 4 4
2023 GLKN Detected and Quantified 3 3 3
2024 METC Valid 27 26
2024 GLKN Detected and Quantified 3 3 3
Secchi_m 2007 METC Valid 3 9 7 10
2007 GLKN Detected and Quantified 8 8 7
2007 GLKN Not Reported 1
2008 METC Valid 3 10 8 9
2008 GLKN Detected and Quantified 3 3 6
2009 METC Valid 8 11 3 13 14
2009 GLKN Detected and Quantified 8 8 8
2010 METC Preliminary 1
2010 METC Valid 12 12 6 15 16
2010 GLKN Detected and Quantified 2 2 2
2011 METC Valid 11 15 6 10 16
2011 GLKN Detected and Quantified 8 7 7
2011 GLKN Not Reported 1 1
2012 METC Valid 10 3 16 6 6 16
2012 GLKN Detected and Quantified 3 3 3
2013 METC Valid 11 13 7 6 17 2
2013 GLKN Detected and Quantified 7 7 7
2014 METC Valid 8 12 4 4 15 2
2014 GLKN Detected and Quantified 7 7 8
2014 GLKN Not Reported 1 1
2015 METC Valid 4 4 5 5 14 2
2015 GLKN Detected and Quantified 8 8 8
2016 METC Valid 3 3 3 4 16
2016 GLKN Detected and Quantified 8 8 8
2017 METC Valid 8 8 8 5 16
2017 GLKN Detected and Quantified 8 8 8
2018 METC Valid 8 5 5 10 14
2018 GLKN Detected and Quantified 7 6 6
2018 GLKN Not Reported 1 1
2018 METC Preliminary 1
2019 METC Valid 5 1 2 10 15
2019 GLKN Detected and Quantified 8 8 8
2020 METC Valid 9 16
2020 GLKN Detected and Quantified 5 6 6
2020 GLKN Not Reported 1
2021 METC Valid 9 7 8
2021 GLKN Detected and Quantified 7 6 6
2021 GLKN Not Reported 1 1
2022 METC Valid 10 4 8
2022 GLKN Detected and Quantified 7 7 6
2023 METC Valid 7 3 4
2023 GLKN Detected and Quantified 6 7 7
2024 METC Valid 9 2 6
2024 GLKN Detected and Quantified 5 5 6
Si_mgL 2007 METC Preliminary 41 37
2007 GLKN Detected and Quantified 3 3 3
2008 METC Preliminary 41 41
2008 GLKN Detected and Quantified 3 3 6
2009 METC Preliminary 42 42
2009 GLKN Detected and Quantified 3 3 3
2010 METC Preliminary 42 43
2010 GLKN Detected and Quantified 3 3 3
2011 METC Preliminary 41 40
2011 GLKN Detected and Quantified 3 3 3
2012 METC Preliminary 34 42
2012 GLKN Detected and Quantified 3 3 3
2013 METC Preliminary 42 42
2013 GLKN Detected and Quantified 3 3 3
2014 METC Preliminary 42 42
2014 GLKN Detected and Quantified 4 4 4
2015 METC Valid 42 40
2015 GLKN Detected and Quantified 3 3 3
2016 METC Valid 42 42
2016 GLKN Detected and Quantified 3 3 3
2017 METC Valid 42 42
2017 GLKN Detected and Quantified 3 3 3
2018 METC Valid 42 42
2018 GLKN Detected and Quantified 3 3 3
2019 METC Valid 17 17
2019 GLKN Detected and Quantified 3 3 3
2020 METC Valid 3 3
2020 GLKN Detected and Quantified 3 3 3
2021 METC Valid 4 4
2021 GLKN Detected and Quantified 3 3 3
2022 GLKN Detected and Quantified 2 2 2
2023 GLKN Detected and Quantified 2 2 2
2023 GLKN Not Detected 1 1 1
2024 GLKN Detected and Quantified 3 3 3
SpecCond_uScm 2007 METC Preliminary 44 44
2007 GLKN Detected and Quantified 88 71 123
2008 METC Valid 39 37
2008 GLKN Detected and Quantified 34 21 94
2009 METC Preliminary 2 6
2009 METC Valid 41 37
2009 GLKN Detected and Quantified 91 72 123
2010 METC Valid 43 43
2010 GLKN Detected and Quantified 38 33 46
2011 METC Preliminary 8
2011 METC Valid 37 43
2011 GLKN Detected and Quantified 98 67 130
2012 METC Valid 34 42
2012 GLKN Detected and Quantified 33 26 46
2013 METC Valid 42 42
2013 GLKN Detected and Quantified 82 71 112
2014 METC Valid 42 42
2014 GLKN Detected and Quantified 104 83 131
2015 METC Valid 42 42
2015 GLKN Detected and Quantified 90 72 123
2016 METC Valid 42 42
2016 GLKN Detected and Quantified 91 84 125
2017 METC Valid 42 42
2017 GLKN Detected and Quantified 94 79 126
2018 METC Valid 44 41
2018 GLKN Detected and Quantified 83 74 113
2019 METC Valid 44 95 42
2019 GLKN Detected and Quantified 102 82 136
2020 METC Valid 37 73 36
2020 GLKN Detected and Quantified 66 62 89
2021 METC Valid 43 41
2021 GLKN Detected and Quantified 78 65 104
2022 METC Valid 42 42
2022 GLKN Detected and Quantified 82 77 113
2023 METC Valid 43 42
2023 GLKN Detected and Quantified 70 71 110
2024 METC Valid 43 42
2024 GLKN Detected and Quantified 71 60 112
TSS_mgL 2007 METC Preliminary 43 44
2007 GLKN Detected and Quantified 8 8 8
2008 METC Valid 41
2008 GLKN Detected and Quantified 3 3 6
2008 METC Preliminary 40
2009 METC Valid 43 43
2009 GLKN Detected and Quantified 8 8 7
2009 GLKN Not Detected 1
2010 METC Valid 42 42
2010 GLKN Detected and Quantified 3 3 3
2011 METC Valid 42 43
2011 GLKN Detected and Quantified 8 8 8
2012 METC Valid 34 42
2012 GLKN Detected and Quantified 3 3 3
2013 METC Valid 42 42
2013 GLKN Detected and Quantified 7 7 7
2014 METC Valid 42 41
2014 GLKN Detected and Quantified 9 9 9
2015 METC Valid 42 41
2015 GLKN Detected and Quantified 6 5 6
2015 GLKN Not Detected 2 3 1
2015 GLKN Present Below Quantification Limit 1
2015 METC Preliminary 1
2016 METC Valid 42 42
2016 GLKN Detected and Quantified 8 7 6
2016 GLKN Not Detected 1 2
2017 METC Valid 42 42
2017 GLKN Detected and Quantified 8 7 5
2017 GLKN Not Detected 1 3
2018 METC Valid 44 42
2018 GLKN Detected and Quantified 6 7 5
2018 GLKN Not Detected 1 1
2018 GLKN Present Below Quantification Limit 1
2019 METC Valid 44 42
2019 GLKN Detected and Quantified 8 8 7
2019 GLKN Not Detected 1
2020 METC Preliminary 1 1
2020 METC Valid 35 34
2020 GLKN Detected and Quantified 6 6 6
2021 METC Valid 43 42
2021 GLKN Detected and Quantified 7 6 6
2021 GLKN Not Detected 1 1
2022 METC Valid 42 42
2022 GLKN Detected and Quantified 7 7 6
2022 GLKN Not Detected 1
2023 METC Valid 43 42
2023 GLKN Detected and Quantified 6 8 6
2023 GLKN Present Below Quantification Limit 1 1
2023 GLKN Not Detected 1
2024 METC Valid 43 42
2024 GLKN Detected and Quantified 6 6 6
2024 GLKN Not Detected 1
TempWater_C 2007 METC Preliminary 44 45
2007 METC Valid 3 9 7 10
2007 GLKN Detected and Quantified 88 71 123
2008 METC Valid 41 3 10 8 9 41
2008 GLKN Detected and Quantified 34 21 94
2009 METC Preliminary 2 6
2009 METC Valid 41 8 11 3 13 14 37
2009 GLKN Detected and Quantified 91 72 123
2010 METC Valid 43 12 12 6 15 16 43
2010 GLKN Detected and Quantified 38 33 46
2011 METC Preliminary 8
2011 METC Valid 37 11 15 6 10 16 43
2011 GLKN Detected and Quantified 98 67 130
2012 METC Valid 34 10 2 16 6 6 16 42
2012 GLKN Detected and Quantified 33 26 46
2013 METC Valid 42 11 13 7 6 17 42
2013 GLKN Detected and Quantified 82 71 112
2014 METC Valid 42 8 12 4 4 14 42
2014 GLKN Detected and Quantified 104 83 131
2015 METC Valid 42 4 4 5 5 15 42
2015 GLKN Detected and Quantified 90 72 123
2016 METC Valid 42 5 5 5 5 16 42
2016 GLKN Detected and Quantified 91 84 125
2017 METC Valid 42 8 8 8 5 16 42
2017 GLKN Detected and Quantified 94 79 126
2018 METC Valid 44 41
2018 GLKN Detected and Quantified 83 74 113
2019 METC Valid 43 5 1 2 10 110 42
2019 GLKN Detected and Quantified 102 82 136
2020 METC Valid 37 9 88 36
2020 GLKN Detected and Quantified 66 62 89
2021 METC Valid 43 9 7 8 42
2021 GLKN Detected and Quantified 78 65 104
2022 METC Valid 42 10 4 8 42
2022 GLKN Detected and Quantified 82 77 113
2023 METC Valid 43 7 3 4 42
2023 GLKN Detected and Quantified 70 71 110
2024 METC Valid 43 9 2 6 42
2024 GLKN Detected and Quantified 71 60 112
Transp_cm 2007 GLKN Detected and Quantified 6 5 4
2007 GLKN Present Above Quantification Limit 3
2008 METC Preliminary 16 1
2008 GLKN Detected and Quantified 2 2 3
2008 GLKN Present Above Quantification Limit 1 1 3
2009 METC Preliminary 22
2009 GLKN Detected and Quantified 6 7 2
2009 GLKN Present Above Quantification Limit 2 1 6
2010 METC Preliminary 18 17
2010 GLKN Detected and Quantified 3 3 2
2010 GLKN Present Above Quantification Limit 1
2011 METC Preliminary 18 5
2011 METC Valid 2
2011 GLKN Detected and Quantified 8 7 2
2011 GLKN Present Above Quantification Limit 1 6
2012 METC Preliminary 22 27
2012 GLKN Detected and Quantified 3 3
2012 GLKN Present Above Quantification Limit 3
2013 METC Valid 30 27
2013 GLKN Detected and Quantified 5 5 1
2013 GLKN Present Above Quantification Limit 2 2 6
2013 METC Preliminary 3
2014 METC Preliminary 2
2014 METC Valid 33 37
2014 GLKN Detected and Quantified 6 4 5
2014 GLKN Not Reported 1 2
2014 GLKN Present Above Quantification Limit 1 2 3
2015 METC Valid 37 27
2015 GLKN Detected and Quantified 4 3 3
2015 GLKN Present Above Quantification Limit 4 5 5
2016 METC Valid 32 22
2016 GLKN Detected and Quantified 4 2 1
2016 GLKN Not Reported 1
2016 GLKN Present Above Quantification Limit 3 6 7
2016 METC Preliminary 4
2017 GLKN Detected and Quantified 5 1
2017 GLKN Present Above Quantification Limit 3 7 8
2018 GLKN Detected and Quantified 1 2
2018 GLKN Present Above Quantification Limit 6 5 6
2018 GLKN Not Reported 1
2019 GLKN Detected and Quantified 4 2 2
2019 GLKN Present Above Quantification Limit 4 6 6
2020 GLKN Detected and Quantified 5 3 4
2020 GLKN Present Above Quantification Limit 1 3 2
2021 GLKN Detected and Quantified 3 3
2021 GLKN Not Reported 2
2021 GLKN Present Above Quantification Limit 2 4 7
2023 GLKN Detected and Quantified 2 2 1
2023 GLKN Present Above Quantification Limit 5 6 7
2024 GLKN Detected and Quantified 5 3 1
2024 GLKN Present Above Quantification Limit 1 3 6
pH 2007 METC Preliminary 44 45
2007 GLKN Detected and Quantified 88 71 123
2008 METC Valid 41 40
2008 GLKN Detected and Quantified 34 21 94
2009 METC Preliminary 2 6
2009 METC Valid 41 37
2009 GLKN Detected and Quantified 91 72 123
2010 METC Valid 41 41
2010 GLKN Detected and Quantified 38 33 46
2011 METC Preliminary 8
2011 METC Valid 37 43
2011 GLKN Detected and Quantified 98 67 130
2012 METC Valid 34 42
2012 GLKN Detected and Quantified 33 26 46
2013 METC Valid 42 42
2013 GLKN Detected and Quantified 82 71 112
2014 METC Valid 42 42
2014 GLKN Detected and Quantified 104 83 131
2015 METC Valid 42 42
2015 GLKN Detected and Quantified 90 72 123
2016 METC Valid 42 42
2016 GLKN Detected and Quantified 91 84 125
2017 METC Valid 41 41
2017 GLKN Detected and Quantified 94 79 126
2018 METC Valid 44 41
2018 GLKN Detected and Quantified 83 74 113
2019 METC Valid 44 95 42
2019 GLKN Detected and Quantified 102 82 136
2020 METC Valid 37 73 36
2020 GLKN Detected and Quantified 55 53 74
2020 GLKN Not Reported 11 9 15
2021 METC Valid 43 42
2021 GLKN Detected and Quantified 78 65 104
2022 METC Valid 42 42
2022 GLKN Detected and Quantified 82 77 113
2023 METC Valid 43 42
2023 GLKN Detected and Quantified 70 71 110
2024 METC Valid 43 42
2024 GLKN Detected and Quantified 71 60 112

Sampling Heat Maps

All Params
Main takeaways from sampling intensity:
  • METC_STCR_23.6 and METC_STCR_0.1 have the most data and are sampled more frequently within years than GLKN sites.
  • METC_STCR_22.5, METC_STCR_22.3, METC_STCR_16.6, METC_STCR_15.3, METC_STCR_12.4, METC_STCR_11.3 primarily have only water temperature, Secchi depth, and ChlA data.
  • Parameters and sites to model for redundancy analyses
    • ChlA_ugL: Good overlap among all sites and years. Analyze for all sites.
    • Cl_mgL: Analyze for METC_23.6 and METC_0.1 (~continuous record); periodic samples in GLKN sites.
    • DO_mgL: Analyze for METC_23.6 and METC_0.1 (~continuous record); good coverage in GLKN sites.
    • NO2+NO3_ugL: Analyze for METC_23.6 and METC_0.1 starting in 2018.
    • P_ugL: Analyze for METC_23.6 and METC_0.1 (~continuous record); periodic samples in GLKN sites.
    • pH: Analyze for METC_23.6 and METC_0.1 (~continuous record); good coverage in GLKN sites.
    • Secchi_m: Good overlap among all sites and years. Analyze for all sites.
    • SO4_mgL: Analyze for METC_23.6 and METC_0.1 (biweekly until 2019); periodic samples in GLKN sites. No new records since 2016.
    • SpecCond_uScm: Analyze for METC_23.6 and METC_0.1 (~continuous record); good coverage in GLKN sites.
    • TempWater_C: Good overlap among all METC sites and years. Analyze for all sites. SACN_15.8 is the only GLKN site with data past 2010.
    • TSS_mgL: Analyze for METC_23.6 and METC_0.1 (~continuous record); periodic samples in GLKN sites.
  • Parameters that may be included after first cut:
    • Alkalinity_mgL: METC started collecting at METC_23.6 and METC_0.1 in 2017, but GLKN has a record back to 2007.
    • DOC_mgL: METC started collecting at METC_23.6 and METC_0.1 in 2016, but GLKN has a record back to 2007.
  • Parameters dropped from redundancy analysis with notes:
    • Ca_mgL: No new samples since 2020 in METC.
    • DOsat_pct: Not sampled by METC. Will use DO_mgL.
    • K_mgL: No new samples since 2020 in METC.
    • Mg_mgL: No new samples since 2020 in METC.
    • Na_mgL: No new samples since 2020 in METC.
    • NH4_ugL: No new samples since 2014 in METC.
    • Si_mgL: No new samples since 2021 in METC.
    • Transp_cm: No new samples since 2016 in METC.
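The "No new samples since ..." notes above can be checked by finding the most recent sampled year for each organization and parameter. The following is a minimal base R sketch on toy data, not the code used for this report; the column names `org` and `year` and all values are assumptions made for illustration.

```r
# Toy data standing in for the combined dataset.
# Column names (org, param_name, year) and values are illustrative assumptions.
toy <- data.frame(
  org        = c("METC", "METC", "METC", "GLKN", "GLKN", "METC"),
  param_name = c("Ca_mgL", "Ca_mgL", "NH4_ugL", "Ca_mgL", "NH4_ugL", "pH"),
  year       = c(2019, 2020, 2014, 2024, 2024, 2024)
)

# Most recent year with at least one sample, per organization and parameter
last_yr <- aggregate(year ~ org + param_name, data = toy, FUN = max)
names(last_yr)[names(last_yr) == "year"] <- "last_year"

# METC parameters whose record ends before the final year in the data
final_year <- max(toy$year)
metc_ended <- subset(last_yr, org == "METC" & last_year < final_year)
metc_ended[order(metc_ended$param_name), ]
```

Parameters returned in `metc_ended` would be candidates for the dropped list, pending a look at the heat maps.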
Heatmap of sampling intensity by site, parameter, and year. Blue bins were sampled 1-5 times within a given year and site, red bins were sampled more than 20 times within a given year, and so on. Sites are ordered by stream mile. Sites with the METC prefix are monitored by the MetCouncil. Sites with the SACN prefix are monitored by GLKN.
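The binning described in the caption can be sketched as counting samples per site, parameter, and year, then cutting the counts into classes. This is a base R illustration on toy data. Only the lowest (1-5) and highest (more than 20) classes come from the caption; the intermediate breaks and the `year` column name are assumptions.

```r
# Toy sample records: one row per sample. Values are illustrative only.
toy <- data.frame(
  site_order = c(rep("METC_STCR_23.6", 25), rep("SACN_STCR_20.0", 11)),
  param_name = "pH",
  year       = c(rep(2019, 25), rep(2019, 3), rep(2020, 8))
)

# Count samples per site, parameter, and year
counts <- aggregate(n ~ site_order + param_name + year,
                    data = transform(toy, n = 1L), FUN = sum)

# Bin the counts. The 6-10 and 11-20 classes are assumed;
# the caption only states the 1-5 and >20 classes.
counts$n_bin <- cut(counts$n,
                    breaks = c(0, 5, 10, 20, Inf),
                    labels = c("1-5", "6-10", "11-20", ">20"))
counts
```

A column like `n_bin` could then be mapped to the fill color of a tile plot with sites on one axis and years on the other.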

Alkalinity_mgL

Heat map of number of samples collected in each year by site for Alkalinity_mgL and detection status.

Ca_mgL

Heat map of number of samples collected in each year by site for Ca_mgL and detection status.

ChlA_ugL

Heat map of number of samples collected in each year by site for ChlA_ugL and detection status.

Cl_mgL

Heat map of number of samples collected in each year by site for Cl_mgL and detection status.

DO_mgL

Heat map of number of samples collected in each year by site for DO_mgL and detection status.

DOC_mgL

Heat map of number of samples collected in each year by site for DOC_mgL and detection status.

DOsat_pct

Heat map of number of samples collected in each year by site for DOsat_pct and detection status.

K_mgL

Heat map of number of samples collected in each year by site for K_mgL and detection status.

Mg_mgL

Heat map of number of samples collected in each year by site for Mg_mgL and detection status.

Na_mgL

Heat map of number of samples collected in each year by site for Na_mgL and detection status.

NH4_ugL

Heat map of number of samples collected in each year by site for NH4_ugL and detection status.

NO2+NO3_ugL

Heat map of number of samples collected in each year by site for NO2+NO3_ugL and detection status.

P_ugL

Heat map of number of samples collected in each year by site for P_ugL and detection status.

pH

Heat map of number of samples collected in each year by site for pH and detection status.

Secchi_m

Heat map of number of samples collected in each year by site for Secchi_m and detection status.

Si_mgL

Heat map of number of samples collected in each year by site for Si_mgL and detection status.

SO4_mgL

Heat map of number of samples collected in each year by site for SO4_mgL and detection status.

SpecCond_uScm

Heat map of number of samples collected in each year by site for SpecCond_uScm and detection status.

TempWater_C

Heat map of number of samples collected in each year by site for TempWater_C and detection status.

Transp_cm

Heat map of number of samples collected in each year by site for Transp_cm and detection status.

TSS_mgL

Heat map of number of samples collected in each year by site for TSS_mgL and detection status.

Discharge

Final Dataset

final_dat <- full_dat |> filter(month %in% 4:11)

dt <- datatable(final_dat, 
                class = 'cell-border stripe', rownames = FALSE, width = '1200px',
                extensions = c("Buttons"),
                options = list(       
                            initComplete = htmlwidgets::JS(
                            "function(settings, json) {",
                              "$('body').css({'font-size': '11px'});",
                              "$('body').css({'font-family': 'Arial'});",
                              "$(this.api().table().header()).css({'font-size': '11px'});",
                              "$(this.api().table().header()).css({'font-family': 'Arial'});",
                            "}"),
                pageLength = 50, autoWidth = TRUE, scrollX = TRUE, scrollY = '600px',
                scrollCollapse = TRUE, lengthMenu = c(5, 10, 50, nrow(final_dat)),
                fixedColumns = list(leftColumns = 1),
                dom = "Blfrtip", buttons = c('copy', 'csv', 'print')),
                filter = list(position = 'top', clear = FALSE)
                )

dt

Analysis

Final Params by Site Matrix

Create a data frame of the site and parameter combinations to analyze, for easier iteration later.

all_sites <- sort(unique(full_dat$site_order))
metc2_sacn <- c("METC_STCR_23.6", "SACN_STCR_20.0", "SACN_STCR_15.8", "SACN_STCR_2.0", "METC_STCR_0.1")

alk <- data.frame(Location_ID = metc2_sacn, param_name = "Alkalinity_mgL")
chla <- data.frame(Location_ID = all_sites, param_name = "ChlA_ugL")
cl <- data.frame(Location_ID = metc2_sacn, param_name = "Cl_mgL")
doc <- data.frame(Location_ID = metc2_sacn, param_name = "DOC_mgL")
do <- data.frame(Location_ID = metc2_sacn, param_name = "DO_mgL")
nox <- data.frame(Location_ID = metc2_sacn, param_name = "NO2+NO3_ugL")
p <- data.frame(Location_ID = metc2_sacn, param_name = "P_ugL")
ph <- data.frame(Location_ID = metc2_sacn, param_name = "pH")
sec <- data.frame(Location_ID = all_sites, param_name = "Secchi_m")
so4 <- data.frame(Location_ID = metc2_sacn, param_name = "SO4_mgL")
sc <- data.frame(Location_ID = metc2_sacn, param_name = "SpecCond_uScm")
tmp <- data.frame(Location_ID = all_sites, param_name = "TempWater_C")
tss <- data.frame(Location_ID = metc2_sacn, param_name = "TSS_mgL")

param_site_mat <- rbind(alk, chla, cl, doc, do, nox, p, ph, sec, so4, sc, tmp, tss)
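One way a matrix like `param_site_mat` might be used for iteration is sketched below in base R, using small stand-in data. The `value` column and the summary function are hypothetical placeholders, not the analysis applied in this report.

```r
# Small stand-in for param_site_mat (same two columns as above)
param_site_mat <- rbind(
  data.frame(Location_ID = c("METC_STCR_23.6", "METC_STCR_0.1"), param_name = "pH"),
  data.frame(Location_ID = c("METC_STCR_23.6", "SACN_STCR_15.8"), param_name = "Secchi_m")
)

# Toy observations; `value` is a hypothetical column name
toy <- data.frame(
  Location_ID = c("METC_STCR_23.6", "METC_STCR_0.1", "SACN_STCR_15.8", "SACN_STCR_15.8"),
  param_name  = c("pH", "pH", "Secchi_m", "pH"),
  value       = c(8.1, 7.9, 1.4, 8.0)
)

# Keep only the site/parameter pairs listed in the matrix
kept <- merge(toy, param_site_mat, by = c("Location_ID", "param_name"))

# Iterate over parameters, applying a placeholder summary to each subset
by_param <- lapply(split(kept, kept$param_name), function(d) {
  data.frame(param_name = d$param_name[1],
             n_sites    = length(unique(d$Location_ID)),
             mean_value = mean(d$value))
})
result <- do.call(rbind, by_param)
result
```

Here the pH record at SACN_STCR_15.8 is dropped by the merge because that pair is not in the stand-in matrix, which is the behavior the matrix is meant to enforce.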